Abstract We study a class of games with a continuum of players for which
Cournot-Nash equilibria can be obtained by the minimisation of some cost
related to optimal transport. This cost is not convex in the usual sense in
general but it turns out to have hidden strict convexity properties in many
relevant cases. This enables us to obtain new uniqueness results and a
characterisation of equilibria in terms of some partial differential equations, a simple
numerical scheme in dimension one as well as an analysis of the inefficiency of
equilibria.
1. INTRODUCTION

Since Aumann's seminal works Aumann [1964, 1966], models with a continuum 
of agents have occupied a distinguished position in economics and game theory. 
Schmeidler [1973] introduced a notion of non-cooperative equilibrium in games with 
a continuum of agents and established several existence results. In Schmeidler's own 
words: Non-atomic games enable us to analyze a conflict situation where the single 
player has no influence on the situation but the aggregative behavior of "large" sets 
of players can change the payoffs. The examples are numerous: Elections, many 
small buyers from a few competing firms, drivers that can choose among several 
roads, and so on. 

Following the approach of Kohlberg et al. [1974] to Walras equilibrium analysis,
Mas-Colell [1984] reformulated Schmeidler's analysis in terms of joint distributions
over agents' actions and characteristics and introduced, in particular, the concept of
Cournot-Nash equilibrium distributions. Not only did Mas-Colell's reformulation enable
him to obtain general existence results in an easy and elegant way, but it is also flexible
enough to accommodate quite weak assumptions on the data (which is relevant in
the framework of games with incomplete information and a continuum of players,
for instance). Roughly speaking, in analysing Cournot-Nash equilibria in the sense
of Mas-Colell [1984] one can take great advantage of (topological but also geometric)
properties of spaces of probability measures. In this respect, it is natural to
expect that optimal transport theory (which is an extremely active field of research
in mathematics, both from an applied and a fundamental point of view, as illustrated by
the monumental textbook Villani [2009]) may be useful.
Even though there are very general existence results for Cournot-Nash equilibria
(see for instance Kahn [1989]) in the literature, we are not aware of classes
of problems where there is uniqueness and a full characterisation of such equilibria
which is tractable enough to obtain closed-form solutions or efficient numerical
computation schemes. One of our goals is precisely to go one step beyond abstract
existence results (in mixed or pure strategies) and to identify classes of non-atomic
games where Cournot-Nash equilibria are unique and can be fully characterised or
numerically computed.

Given a space of player types $X$ endowed with a probability measure $\mu \in \mathcal{P}(X)$
(which gives the exogenous distribution of the type of the agents), an action space
$Y$ and a cost $\Phi : X \times Y \times \mathcal{P}(Y) \to \mathbb{R}$, $x$-type agents taking
action $y$ pay the cost $\Phi(x,y,\nu)$ where $\nu \in \mathcal{P}(Y)$ represents the action
distribution. The fact that this cost depends on the other agents' actions only through
the distribution $\nu$ means that who plays what does not matter i.e. the game is anonymous.
A Cournot-Nash equilibrium is a joint probability measure $\gamma \in \mathcal{P}(X \times Y)$
with first marginal $\mu$ such that

$$ \gamma\Big(\Big\{(x,y) \in X \times Y \;:\; \Phi(x,y,\nu) = \min_{z \in Y} \Phi(x,z,\nu)\Big\}\Big) = 1 \tag{1.1} $$

where $\nu$ represents $\gamma$'s second marginal. The probability $\gamma$ is naturally interpreted
by saying that $\gamma(A \times B)$ is the probability that agents have their type in $A$ and an
action in $B$. The equilibrium is called pure if, in addition, $\gamma$ is carried by a graph
i.e. the agents play in pure strategy $\mu$-a.e. Condition (1.1) means that agents
choose cost minimising strategies given their type and $\nu$ so that, finally, imposing
that $\nu$ is the second marginal of $\gamma$ is a simple self-consistency requirement.

In the sequel, we will restrict ourselves to the additively separable case where
$\Phi(x,y,\nu) = c(x,y) + V[\nu](y)$, which seems to be a necessary limitation for the
optimal transport approach we will develop. Under this separability specification,
the connection with optimal transport is almost obvious: if $\gamma$ is a Cournot-Nash
equilibrium it necessarily minimises the average of $c$ among probability measures
having $\mu$ and $\nu$ (which is a priori unknown) as marginals i.e. it solves the optimal
transport problem:

$$ W_c(\mu,\nu) := \inf_{\gamma \in \Pi(\mu,\nu)} \iint_{X \times Y} c(x,y)\, d\gamma(x,y) \tag{1.2} $$

where $\Pi(\mu,\nu)$ is the set of joint probabilities having $\mu$ and $\nu$ as marginals. In a
Euclidean setting, there are well-known conditions on $c$ and $\mu$ which guarantee that
such an optimal $\gamma$ necessarily is pure whatever $\nu$ is and this of course implies purity
of equilibria.

If we go one step further and assume that $V[\nu]$ is the differential of some functional
$\mathcal{E}$ (see Section 3 for a precise definition), it turns out that if $\nu$ is a minimiser of
$\mathcal{E}[\nu] + W_c(\mu,\nu)$ and $\gamma$ solves (1.2) then it is a Cournot-Nash equilibrium. This gives
a variational device to find equilibria: first find $\nu$ by minimising $\mathcal{E}[\nu] + W_c(\mu,\nu)$
and then find $\gamma$ by solving the optimal transport problem (1.2) between $\mu$ and $\nu$.
This variational approach actually gives new existence results. To the best of our
knowledge, usual general existence proofs are via fixed-point arguments and thus
require a lot of regularity for the dependence of $V[\nu]$ with respect to $\nu$: in a
compact metric setting, it is typically asked that $\nu \mapsto V[\nu]$ is continuous (or at least
upper semi-continuous in some sense) from the set of probabilities equipped with
the weak-* topology to the set of continuous functions equipped with the supremum
norm. This is harmless if $Y$ is finite but extremely restrictive in general;
in particular it excludes the case of a purely local dependence which is relevant
to capture congestion effects (actions that are frequently played are more costly).
In contrast, the variational approach will enable us to treat such local congestion
effects. If $\mathcal{E}$ (the primitive of $V$ in some sense) is convex then equilibria and
minimisers coincide and strict convexity gives uniqueness of the second marginal, $\nu$,
of the equilibrium. Such a convexity is quite demanding in applications but we
shall prove that in a Euclidean setting and for a quadratic $c$ (and more generally
strictly convex $c$'s in dimension one), there is some hidden convexity (in the spirit
of the seminal results of McCann [1997]) in the problem from which one can
deduce uniqueness of equilibria but also a characterisation in terms of a nonlinear
partial differential equation of Monge-Ampère type. This partial differential equation
cannot be solved explicitly in general but, in dimension one, it is easy to solve
the variational problem numerically in an efficient way as we shall illustrate on
several examples. Another advantage of the variational approach is that it allows
for an elementary (in-)efficiency analysis of the equilibrium and the design of a
tax system to restore the efficiency of the equilibrium (see Section 5). Of course,
the variational approach described above presents strong similarities with the
potential games of Monderer and Shapley [1996] and our framework is very close to
that of Konishi et al. [1997] or LeBreton and Weber [2011] in the case of a finite
number of players; however we are not aware of any extension of the analysis of
Monderer and Shapley [1996] to the case of a continuum of players.

Apart from our results on Cournot-Nash equilibria, another objective of the 
paper is to contribute to popularise the use of optimal transport in economics. 
Several recent papers have fruitfully used optimal transport arguments in such dif- 
ferent fields as hedonics and matching problems (Chiappori et al. [2010], Ekeland 
[2010]), multidimensional screening (Carlier [2003], Figalli et al. [2011]) or urban 
economics (Blanchet et al. [2012], Carlier and Ekeland [2007]). We believe that 
cross-fertilisation between economics and optimal transport will rapidly develop. 
This is why we have included in Appendix A some basic results from optimal
transport theory, which we hope can serve as a comprehensive introduction to this
vast subject for a readership of economists.
The present introduction would be neither complete nor fair without an explicit
reference to the mean-field games theory of Lasry and Lions [2006a, b, 2007].
Indeed, our variational approach is largely inspired by the Lasry and Lions optimal
control approach to mean-field games (which has some similarities with optimal
transport); moreover, mean-field games theory makes it possible to treat considerably
richer situations than the somewhat static one we treat here. Another line of research we
would like to mention concerns congestion games (another example of potential
games) and the literature on the cost of anarchy (see Roughgarden [2005] and the
references therein). Indeed, the variational approach we develop presents some
similarities with the variational approach to Wardrop equilibria on congested networks,
and in both cases equilibria are socially inefficient.

The paper is organised as follows. In Section 2, we introduce the model, define 
equilibria and emphasise some connections with optimal transport. In Section 3, 
we adopt a variational approach and prove that for a large class of interactions, 
equilibria naturally arise as local minimisers of a certain functional. Section 4 is
devoted to further uniqueness results and variational characterisations of equilibria,
obtained thanks to notions of displacement convexity arising in optimal transport. We also
characterise the equilibrium via a certain nonlinear partial differential equation and
compute the equilibrium numerically in dimension one. Section 5 concludes. The
proofs as well as a presentation of various results from optimal transport
theory which are used throughout the paper are gathered in the Appendix.

2. THE EQUILIBRIUM MODEL 

The model consists of a compact metric type space $X$ equipped with a Borel
probability measure $\mu \in \mathcal{P}(X)$, giving the distribution of types, a compact metric
action space $Y$, a reference$^1$ Borel non-negative measure $m_0$ and a continuous function
$c \in C(X \times Y)$; interactions are captured by a map which to every action
distribution $\nu \in \mathcal{P}(Y) \cap L^1(m_0)$ associates a function $V[\nu]$ defined
$m_0$-almost-everywhere. Given an action distribution $\nu$, $x$-type agents taking action $y$
then incur the additively separable cost $c(x,y) + V[\nu](y)$. The unknown is a probability
distribution $\gamma \in \mathcal{P}(X \times Y)$, with the interpretation that $\gamma(A \times B)$ is the
probability that an agent has her type in $A$ and takes an action in $B$. Such a $\gamma$ induces
as action distribution its second marginal, which we denote $\nu = \pi_{Y\#}\gamma$. By construction,
the first marginal of $\gamma$, $\pi_{X\#}\gamma$, should be equal to $\mu$. Since we will be interested in
efficiency (or rather inefficiency) properties of equilibria, we will also impose that
$\gamma$ has finite social cost, where the latter is given by

$$ SC = \iint_{X \times Y} \big(c(x,y) + V[\nu](y)\big)\, d\gamma(x,y) = \iint_{X \times Y} c(x,y)\, d\gamma(x,y) + \int_Y V[\nu](y)\, d\nu(y). $$

Since $c$ is continuous the first term is finite for every $\gamma$, but the second requires the
action marginal $\nu$ to belong to the domain

$$ \mathcal{D} := \{\nu \in L^1(m_0) \;:\; V[\nu] \in L^1(\nu)\} = \Big\{\nu \in L^1(m_0) \;:\; \int_Y |V[\nu]|\, d\nu < +\infty\Big\}. \tag{2.1} $$

1 The role of the reference measure $m_0$ is here to capture purely local congestion effects as in the
examples below. In other words, we will require action distributions to be absolutely continuous
with respect to $m_0$. This departs from the common assumption that the cost is well defined for
every action distribution and satisfies some strong continuity/semi-continuity with respect to the
weak-* topology of measures as in Mas-Colell [1984] or Kahn [1989].

Cournot-Nash equilibria are then defined as follows. 

Definition 2.1 $\gamma \in \mathcal{P}(X \times Y)$ is a Cournot-Nash equilibrium if its first marginal
is $\mu$, its second marginal, $\nu$, belongs to $\mathcal{D}$ and there exists $\varphi \in C(X)$ such that

$$ c(x,y) + V[\nu](y) \ge \varphi(x) \ \text{ for all } x \in X \text{ and } m_0\text{-a.e. } y, \text{ with equality } \gamma\text{-a.e.} \tag{2.2} $$

A Cournot-Nash equilibrium $\gamma$ is called pure whenever it is carried by a graph i.e.
is of the form $\gamma = (\mathrm{id}, T)_{\#}\mu$ for some Borel map $T : X \to Y$.
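
To make Definition 2.1 concrete, the following is a minimal sketch of the equilibrium test in a finite setting. It assumes finitely many types and actions and takes the reference measure to be the counting measure on the actions, so that the domain condition is automatic; these simplifications, and the names `is_cournot_nash`, `gamma`, `interaction` and `congestion`, are illustrative assumptions and not part of the model above.

```python
def is_cournot_nash(gamma, mu, c, interaction, tol=1e-9):
    """Check condition (2.2) for a finite plan (illustrative sketch).

    gamma[i][j]: mass of agents of type i playing action j
    mu[i]:       mass of type i
    c[i][j]:     cost c(x_i, y_j)
    interaction: function nu -> list of the values V[nu](y_j)
    """
    n, m = len(mu), len(c[0])
    # The first marginal of gamma must be mu.
    if any(abs(sum(gamma[i]) - mu[i]) > tol for i in range(n)):
        return False
    # Second marginal nu (the action distribution).
    nu = [sum(gamma[i][j] for i in range(n)) for j in range(m)]
    v = interaction(nu)
    for i in range(n):
        total = [c[i][j] + v[j] for j in range(m)]
        phi_i = min(total)  # phi(x_i): least total cost available to type i
        # Every action played with positive mass must attain that minimum.
        if any(gamma[i][j] > tol and total[j] > phi_i + tol for j in range(m)):
            return False
    return True


# Two types and two actions located at 0 and 1, c(x, y) = |x - y|,
# and a purely local congestion cost V[nu](y) = nu(y) (hypothetical example).
mu = [0.5, 0.5]
c = [[0.0, 1.0], [1.0, 0.0]]
congestion = lambda nu: list(nu)

stay_home = [[0.5, 0.0], [0.0, 0.5]]    # each type plays its own location
all_at_zero = [[0.5, 0.0], [0.5, 0.0]]  # everybody plays action 0

print(is_cournot_nash(stay_home, mu, c, congestion))    # True
print(is_cournot_nash(all_at_zero, mu, c, congestion))  # False
```

In the second plan, agents of type 1 pay 2 at action 0 while action 1 would cost them 0, so the plan is not an equilibrium.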

The previous definition is slightly different from that of Mas-Colell [1984] because
we require the action distribution to be absolutely continuous with respect to
$m_0$, so as to take into account congestion effects as explained in the examples
below. This makes the existence of equilibria nontrivial: indeed, when $\nu \mapsto V[\nu]$
is continuous from $(\mathcal{P}(Y), w\text{-}*)$ to $(C(Y), \|\cdot\|_\infty)$ (as is the case for instance when
$V[\nu](y) = \int_Y \phi(y,z)\, d\nu(z)$ with $\phi$ continuous), standard fixed-point arguments
immediately give the existence of Cournot-Nash equilibria, but here we do not
have such regularity.



2.1. Examples 

Holiday choice 

Let us consider a population of agents whose location is distributed according
to some probability distribution $\mu \in \mathcal{P}(X)$ where $X$ is some compact subset of $\mathbb{R}^2$
(say). These agents have to choose their holiday destination (possibly in mixed
strategy). The set of possible holiday destinations is some compact subset of the
plane $Y$ (it can be $X$, a finite set, ...). The commuting cost from $x$ to $y$ is $c(x,y)$. In
addition to the commuting cost, agents incur costs resulting from interactions with
other agents; this is captured by a map $\nu \mapsto V[\nu]$ that can be modelled as follows.
A natural effect that has to be taken into account is congestion, i.e. the fact that
a more crowded location results in more disutility for the agents. Congestion thus
requires considering local effects and actually imposes that $\nu$ is not too concentrated;
a way to capture this is to impose that $\nu$ is absolutely continuous with
respect to some reference probability measure $m_0$. Still denoting by $\nu$ the
Radon-Nikodym derivative of $\nu$, a natural congestion cost is of the form $y \mapsto f(\nu(y))$
with $f$ non-decreasing. In addition to the negative externality due to congestion,
there may be a positive externality due to the positive social interactions
between agents, which can be captured through a non-local term of the
form $y \mapsto \int_Y \phi(y,z)\, d\nu(z)$ where for instance $\phi(y,\cdot)$ is minimal for $z = y$ so that
the previous term represents a cost for being far from the rest of the population.
Finally, the presence of purely geographical factors (e.g. distance to the sea) can
be reflected by a term of the form $y \mapsto v(y)$. The total externality cost generated
by the distribution $\nu$ combines the three effects of congestion, positive interactions
and geographical factors and can then be taken of the form

$$ V[\nu](y) = f(\nu(y)) + \int_Y \phi(y,z)\, d\nu(z) + v(y). $$


Technological choice 

Consider now a simple model of technological choice in the presence of externalities.
There is a set of consumers indexed by a type $x \in X$ drawn according to the
probability $\mu$, and a set of technologies $Y$ for a certain good (cell-phone, computer,
tablet...). On the supply side, assume there is a single profit-maximising firm
with convex production cost $F(y,\cdot)$ producing technology $y$; the supply (which equals
demand at equilibrium) of this firm is thus determined by the marginal pricing rule
$p(y) = \partial_\nu F(y,\nu(y))$. Agents aim to minimise with respect to $y$ a total cost which
is the sum of their individual purchasing cost $c(x,y) + p(y) = c(x,y) + \partial_\nu F(y,\nu(y))$
and an additional usage/maintenance or accessibility cost which is positively
affected by the number of consumers having purchased similar technologies i.e. a
term of the form $\int_Y \phi(y,z)\, d\nu(z)$ where $\phi$ is increasing in the distance between
technologies $y$ and $z$.

2.2. Connection with optimal transport and purity of equilibria

For $\nu \in \mathcal{P}(Y)$, let $\Pi(\mu,\nu)$ denote the set of probability measures on $X \times Y$ having
$\mu$ and $\nu$ as marginals and let $W_c(\mu,\nu)$ be the least cost of transporting $\mu$ to $\nu$ for
the cost $c$ i.e. the value of the Monge-Kantorovich optimal transport problem:

$$ W_c(\mu,\nu) := \inf_{\gamma \in \Pi(\mu,\nu)} \iint_{X \times Y} c(x,y)\, d\gamma(x,y); $$

let us also denote by $\Pi_o(\mu,\nu)$ the (nonempty) set of optimal transport plans i.e.

$$ \Pi_o(\mu,\nu) := \Big\{\gamma \in \Pi(\mu,\nu) \;:\; \iint_{X \times Y} c(x,y)\, d\gamma(x,y) = W_c(\mu,\nu)\Big\}. $$


A first link between Cournot-Nash equilibria and optimal transport is based on 
the following straightforward observation. 

Lemma 2.2 If $\gamma$ is a Cournot-Nash equilibrium and $\nu$ denotes its second marginal
then $\gamma \in \Pi_o(\mu,\nu)$.

PROOF: Indeed, let $\varphi \in C(X)$ be such that (2.2) holds and let $\eta \in \Pi(\mu,\nu)$; then
we have

$$ \iint_{X \times Y} c(x,y)\, d\eta(x,y) \ge \iint_{X \times Y} \big(\varphi(x) - V[\nu](y)\big)\, d\eta(x,y) = \int_X \varphi(x)\, d\mu(x) - \int_Y V[\nu](y)\, d\nu(y) = \iint_{X \times Y} c(x,y)\, d\gamma(x,y) $$

so that $\gamma \in \Pi_o(\mu,\nu)$. Q.E.D.

The previous proof also shows that $\varphi$ solves the dual of $W_c(\mu,\nu)$ (see
Appendix (A.3)) i.e. maximises the functional

$$ \int_X \varphi(x)\, d\mu(x) + \int_Y \varphi^c(y)\, d\nu(y) $$

where $\varphi^c$ denotes the $c$-transform of $\varphi$ i.e.

$$ \varphi^c(y) := \min_{x \in X} \{c(x,y) - \varphi(x)\}. \tag{2.3} $$
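
As an illustration of (2.3), here is a minimal sketch of the $c$-transform on finite grids. The grids `xs`, `ys`, the potential `phi` and the quadratic cost are arbitrary choices made for the example, and `c_transform` is a hypothetical helper name.

```python
def c_transform(phi, xs, ys, cost):
    """Return [phi^c(y) for y in ys], where phi^c(y) = min_x {cost(x, y) - phi(x)}."""
    return [min(cost(x, y) - p for x, p in zip(xs, phi)) for y in ys]


quadratic = lambda x, y: 0.5 * (x - y) ** 2

xs = [0.0, 0.5, 1.0]   # grid of types (illustrative)
ys = [0.0, 0.25, 1.0]  # grid of actions (illustrative)
phi = [0.0, 0.1, 0.0]  # an arbitrary potential on the type grid

phi_c = c_transform(phi, xs, ys, quadratic)

# By construction phi(x) + phi^c(y) <= c(x, y) for every pair (x, y),
# which is the constraint of the dual problem.
feasible = all(
    p + q <= quadratic(x, y) + 1e-12
    for x, p in zip(xs, phi)
    for y, q in zip(ys, phi_c)
)
print(feasible)  # True
```

The pair $(\varphi, \varphi^c)$ is therefore always admissible in the dual problem, and for each $y$ the inequality is an equality at the minimising $x$.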

In a Euclidean setting, there are well-known conditions on $c$ and $\mu$ which
guarantee that such an optimal $\gamma$ necessarily is pure whatever $\nu$ is. It is the case for
instance if $\mu$ is absolutely continuous with respect to the Lebesgue measure and $c(x,y)$
is a smooth and strictly convex function of $x - y$ (see McCann and Gangbo [1996]
who extended the seminal results of Brenier [1991] in the quadratic cost case),
or more generally, when it satisfies a generalised Spence-Mirrlees condition (see
Carlier [2003] for details):

Corollary 2.3 Assume that $X = \overline{\Omega}$ where $\Omega$ is some open connected bounded
subset of $\mathbb{R}^d$ with negligible boundary, that $\mu$ is absolutely continuous with respect to
the Lebesgue measure, that $c$ is differentiable with respect to its first argument, that
$\nabla_x c$ is continuous on $\mathbb{R}^d \times Y$ and that it satisfies the generalised Spence-Mirrlees
condition:

for every $x \in X$, the map $y \in Y \mapsto \nabla_x c(x,y)$ is injective,

then for every $\nu \in \mathcal{P}(Y)$, $\Pi_o(\mu,\nu)$ consists of a single element and the latter is of
the form $\gamma = (\mathrm{id}, T)_{\#}\mu$, hence every Cournot-Nash equilibrium is pure.



3. A VARIATIONAL APPROACH

In this section, we will see that in many relevant cases, one may obtain equilibria
by the minimisation of some functional over a set of probability measures$^2$. The
main assumption for this variational approach to be valid is that the interaction
map $V[\nu]$ has the structure of a differential i.e. that $V[\nu]$ can be seen as the first
variation of some function $\nu \mapsto \mathcal{E}[\nu]$. In this case, the variational approach is
based on the observation that the equilibrium condition is the first-order optimality
condition for the minimisation of $W_c(\mu,\nu) + \mathcal{E}[\nu]$.

2 Note the analogy with the variational approach of Monderer and Shapley [1996] for potential
games, i.e. games whose equilibria can be obtained by minimising some potential function.

3.1. Interaction maps which are differentials

The main assumption for the variational approach to be valid is that $\nu \mapsto V[\nu]$
is a differential in the following sense:

Definition 3.1 (Differential) Let $\mathcal{D}$ be defined by (2.1). The map $\nu \in \mathcal{D} \mapsto V[\nu]$
is a differential on $\mathcal{D}$ if $\mathcal{D}$ is convex and there exists $\mathcal{E} : \mathcal{D} \to \mathbb{R}$ such that for every
$(\rho,\nu) \in \mathcal{D}^2$, $V[\nu] \in L^1(\rho)$ and

$$ \lim_{\varepsilon \to 0^+} \frac{\mathcal{E}[\nu + \varepsilon(\rho - \nu)] - \mathcal{E}[\nu]}{\varepsilon} = \int_Y V[\nu]\, d(\rho - \nu) $$

i.e. $V[\nu]$ is the first variation of $\mathcal{E}$, which we denote $V[\nu] = \frac{\delta \mathcal{E}}{\delta \nu}$.

Before going any further, let us consider some examples to illustrate the previous
definition.

Local term

Let us consider first the case of a local dependence; again $m_0$ is our reference
measure and for $\nu \in \mathcal{D} := \mathcal{P}(Y) \cap L^1(m_0)$ and $m_0$-a.e. $y$:

$$ V[\nu](y) = f(y, \nu(y)) $$

for some continuous $f$. Assume first that $f$ is bounded and define, for all $\nu \in \mathcal{D}$:

$$ \mathcal{E}[\nu] := \int_Y F(y, \nu(y))\, dm_0(y) \quad \text{with} \quad F(y,\nu) := \int_0^\nu f(y,s)\, ds. \tag{3.1} $$

Then since $F$ is Lipschitz in $\nu$ uniformly in $y$, it easily follows from Lebesgue's
dominated convergence theorem that $V[\nu]$ is the differential of $\mathcal{E}$ on $\mathcal{D}$. Now, rather
assume that $f$ satisfies the growth condition

$$ a(\nu^\alpha - 1) \le f(y,\nu) \le b(\nu^\alpha + 1) \tag{3.2} $$

for some $a > 0$, $b > 0$, $\alpha > 0$, $m_0$-a.e. $y$ and every $\nu > 0$. For $p = \alpha + 1$,
the corresponding energy functional $\mathcal{E}$ is then defined for all $\nu \in \mathcal{D} := \mathcal{P}(Y) \cap
L^p(m_0)$ as above by (3.1). Thanks to (3.2) and Lebesgue's dominated convergence
theorem, $V$ is the differential of $\mathcal{E}$ on $\mathcal{D}$, and $V[\nu] \in L^{p'}(m_0)$ (with $p'$ the conjugate
exponent of $p$ i.e. $p' = p/(p-1)$) as soon as $\nu \in \mathcal{D}$. Apart from the technical
growth condition (which is useful to apply Lebesgue's theorem and guarantee that
$V[\nu]\nu$ is integrable) we therefore see that local $V$'s are differentials. In Section 3.3,
we will treat local $V$'s under a different Inada-like condition on $f$ which is more
customary in economics and will ensure that $\nu$ remains positive, hence simplifying
the equilibrium/optimality condition.

Non-local interaction term and the role of symmetry

Let us now consider the case of (pairwise) interactions where $V[\nu]$ is defined by

$$ V[\nu](y) = \int_Y \phi(y,z)\, d\nu(z) $$

for some $\phi \in C(Y \times Y)$. It is then natural to define the quadratic functional

$$ \mathcal{E}[\nu] := \frac{1}{2} \iint_{Y \times Y} \phi(y,z)\, d\nu(y)\, d\nu(z). $$

By expanding $\mathcal{E}[\nu + \varepsilon(\rho - \nu)]$ in $\varepsilon$, its differential is immediate to compute:

$$ \lim_{\varepsilon \to 0^+} \frac{\mathcal{E}[\nu + \varepsilon(\rho - \nu)] - \mathcal{E}[\nu]}{\varepsilon}
 = \frac{1}{2} \iint \phi(y,z) \big[ d\nu(y)\, d(\rho - \nu)(z) + d\nu(z)\, d(\rho - \nu)(y) \big]
 = \frac{1}{2} \iint \big[\phi(y,z) + \phi(z,y)\big]\, d\nu(z)\, d(\rho - \nu)(y). $$

So that

$$ \frac{\delta \mathcal{E}}{\delta \nu}(y) = \int_Y \frac{\phi(y,z) + \phi(z,y)}{2}\, d\nu(z). $$

Hence $V$ is the differential of $\mathcal{E}$ on $\mathcal{P}(Y)$ as soon as $\phi$ is symmetric$^3$ i.e. $\phi(y,z) =
\phi(z,y)$ (which is the case for instance if $\phi$ is a function of the distance between $y$
and $z$). Note that the assumption that $V$ is a differential requires $\phi$ to be
symmetric.$^4$ Of course, one can combine the previous examples and consider a $V$ which is
the sum of a symmetric interaction term and a local term; such $V$'s still have the
structure of a differential.

3 Let us remark that in the case of a finite number of players, the role of symmetry for the
potential approach to work was already pointed out in LeBreton and Weber [2011].
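
The role of symmetry can be checked numerically on a finite action set. The sketch below assumes two actions and the counting measure as reference, and compares a finite-difference quotient of the quadratic energy with the pairing of $V[\nu]$ against $\rho - \nu$; the matrices `symmetric` and `asymmetric` and all function names are hypothetical choices made for the illustration.

```python
def energy(nu, phi):
    """E[nu] = 1/2 * sum_{i,j} phi[i][j] * nu[i] * nu[j]."""
    m = len(nu)
    return 0.5 * sum(phi[i][j] * nu[i] * nu[j] for i in range(m) for j in range(m))


def interaction(nu, phi):
    """V[nu](y_i) = sum_j phi[i][j] * nu[j]."""
    return [sum(row[j] * nu[j] for j in range(len(nu))) for row in phi]


def directional_derivative(nu, rho, phi, eps=1e-6):
    """Finite-difference quotient (E[nu + eps (rho - nu)] - E[nu]) / eps."""
    shifted = [n + eps * (r - n) for n, r in zip(nu, rho)]
    return (energy(shifted, phi) - energy(nu, phi)) / eps


def pairing(nu, rho, phi):
    """Integral of V[nu] against rho - nu."""
    return sum(v * (r - n) for v, n, r in zip(interaction(nu, phi), nu, rho))


nu, rho = [0.3, 0.7], [0.6, 0.4]
symmetric = [[0.0, 1.0], [1.0, 0.0]]
asymmetric = [[0.0, 2.0], [0.0, 0.0]]  # same energy, but phi(y, z) != phi(z, y)

for phi in (symmetric, asymmetric):
    print(directional_derivative(nu, rho, phi), pairing(nu, rho, phi))
```

For the symmetric kernel the two printed numbers agree up to the discretisation error (both are close to 0.12), whereas for the asymmetric kernel the pairing is 0.42 while the derivative of the energy is still close to 0.12: in that case $V$ is not the first variation of $\mathcal{E}$.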



4 In a similar way, if we consider the case of higher-order interactions

$$ V[\nu](y) = \int_{Y^m} \phi(y,\cdot)\, d\nu^{\otimes m} = \int_{Y^m} \phi(y,z_1,\ldots,z_m)\, d\nu(z_1) \ldots d\nu(z_m) $$

where $\phi \in C(Y^{m+1})$ satisfies the symmetry relations

$$ \phi(y,z_1,\ldots,z_m) = \phi(z_1,y,\ldots,z_m) = \cdots = \phi(z_m,z_1,\ldots,y), \tag{3.3} $$

for all $(y,z_1,\ldots,z_m) \in Y^{m+1}$, then $V$ is the differential of

$$ \mathcal{E}[\nu] := \frac{1}{m+1} \int_{Y^{m+1}} \phi\, d\nu^{\otimes (m+1)}. $$

3.2. Minimisers are equilibria

Throughout this paragraph, we assume that

$$ V[\nu] = \frac{\delta \mathcal{E}}{\delta \nu} \ \text{ on } \mathcal{D}. \tag{3.4} $$

We then consider the variational problem

$$ \inf_{\nu \in \mathcal{D}} J_\mu[\nu] \quad \text{where} \quad J_\mu[\nu] := W_c(\mu,\nu) + \mathcal{E}[\nu]. \tag{3.5} $$

To prove that minimisers of (3.5) are equilibria, we first need to be able to
differentiate the term $W_c(\mu,\nu)$ with respect to $\nu$; this is possible thanks to Lemma A.1
proved in Appendix A but it requires more structure on $X$, $\mu$ and $c$: in particular $X$
is a connected subset of $\mathbb{R}^d$, $c$ is differentiable with respect to $x$ and $\mu$ is equivalent
to the Lebesgue measure.

Theorem 3.2 (Minimisers are equilibria) Assume that $V[\nu]$ satisfies (3.4) with
$\mathcal{D} = \mathcal{P}(Y) \cap L^p(m_0)$ for some $p \in [1,+\infty[$ and that the assumptions of Lemma A.1
hold true. If $\nu$ solves (3.5) and $\gamma \in \Pi_o(\mu,\nu)$ then $\gamma$ is a Cournot-Nash equilibrium.

See Appendix B.1 for the proof. Let us mention however that the optimality
condition for (3.5) is the following: there is a constant $M$ such that

$$ \varphi^c + V[\nu] \ge M \ \ m_0\text{-a.e., with equality } \nu\text{-a.e.,} \tag{3.6} $$

where $\varphi$ is a Kantorovich potential between $\mu$ and $\nu$ and $\varphi^c$ is the $c$-transform of
$\varphi$ as in (2.3).

To deduce an existence result from Theorem 3.2, assume that $V[\nu]$ is defined for
$\nu \in \mathcal{P}(Y) \cap L^1(m_0)$ by

$$ V[\nu](y) := f(y, \nu(y)) + \int_Y \phi(y,z)\, d\nu(z) \tag{3.7} $$

where $\phi \in C(Y \times Y)$ is symmetric, $f$ is continuous and non-decreasing with respect
to its second argument and satisfies the growth condition (3.2) for some $a > 0$,
$b > 0$ and $\alpha > 0$. For $p = \alpha + 1$, the corresponding energy functional is then defined
for all $\nu \in \mathcal{D} := \mathcal{P}(Y) \cap L^p(m_0)$ by

$$ \mathcal{E}[\nu] = \int_Y F(y, \nu(y))\, dm_0(y) + \frac{1}{2} \iint_{Y^2} \phi(y,z)\, d\nu(y)\, d\nu(z) $$

where $F$ is defined by (3.1). The function $F(y,\cdot)$ is convex and satisfies the growth
condition

$$ a(p^{-1}\nu^p - \nu) \le F(y,\nu) \le b(p^{-1}\nu^p + \nu). $$

Hence $V[\nu] \in L^{p'}(m_0)$ as soon as $\nu \in \mathcal{D}$ and thus, by Hölder's inequality,
$V[\nu]\rho \in L^1(m_0)$ for every $\rho$ and $\nu$ in $\mathcal{D}$.

Corollary 3.3 (Existence of equilibria by minimisation) Assume that the
assumptions of Lemma A.1 hold and that $\nu \mapsto V[\nu]$ is of the form (3.7) where $f$ and $\phi$
satisfy the assumptions above; then (3.5) admits minimisers in $\mathcal{P}(Y) \cap L^p(m_0)$ so
that there exist Cournot-Nash equilibria.

The proof is given in Appendix B.2. Note that this in particular provides exis- 
tence of equilibria results for the holiday and technological choice model examples 
above. 

Remark 3.4 Under the assumptions of the previous corollary, one can prove
that the minimisers are actually bounded: indeed let $\nu$ be such a minimiser; either
$\nu(y) = 0$ or $\nu(y) > 0$ and for $m_0$-a.e. such points, by the optimality condition (3.6)
and (3.2), one should have for some constant $M$

$$ a(\nu(y)^\alpha - 1) \le f(y, \nu(y)) = M - \varphi^c(y) - \int_Y \phi(y,z)\, d\nu(z). $$

Since $\varphi^c$ is a $c$-transform, it is continuous hence bounded on $Y$ and the integral
term is bounded since $\phi$ is. We therefore have $\nu \in L^\infty(m_0)$.

Let us now emphasise the role of convexity in the variational approach. As
expected, if $\mathcal{E}$ is convex, then finding equilibria and minimising are equivalent:
Proposition 3.5 (Equivalence in the convex case) Assume that the assumptions
of Theorem 3.2 are satisfied. If $\mathcal{E}$ is convex on $\mathcal{D}$ then the following statements are
equivalent:

• $\nu$ solves (3.5) and $\gamma \in \Pi_o(\mu,\nu)$,

• $\gamma$ is an equilibrium and $\nu = \pi_{Y\#}\gamma$.

If moreover $\mathcal{E}$ is strictly convex the following uniqueness result holds:

Corollary 3.6 (Uniqueness in the strictly convex case) Assume that the
assumptions of Theorem 3.2 are satisfied. If $\mathcal{E}$ is strictly convex then all equilibria
share the same second marginal $\nu$. If in addition, the assumptions of Corollary 2.3
are satisfied then there is at most one Cournot-Nash equilibrium.

As an application, let us observe that if the assumptions of Lemma A.1 are satisfied
and if $V[\nu](y) = f(y, \nu(y))$ with an $f$ which is increasing in $\nu$ and satisfies (3.2)
then there exists a unique minimiser so the previous uniqueness result holds. This
applies naturally to the technological choice equilibrium problem as well as to the
holiday choice example with pure congestion or, more generally, in the case where
the congestion effects dominate as explained below.

In the case where

$$ \mathcal{E}[\nu] = \int_Y F(y, \nu(y))\, dm_0(y) + \frac{1}{2} \iint_{Y^2} \phi(y,z)\, d\nu(y)\, d\nu(z) $$

the second, non-local term typically favours the concentration of $\nu$ (when $\phi(y,z)$
is increasing with the distance between $y$ and $z$ for instance) and it is not convex,
while the congestion term fosters dispersion and is convex. There may however
be some compensation between the two terms that makes $\mathcal{E}$ convex. For instance,
by the Cauchy-Schwarz inequality, the quadratic form

$$ \int_Y \nu^2(y)\, dm_0(y) + \iint_{Y^2} \phi(y,z)\, d\nu(y)\, d\nu(z) $$

is positive definite hence convex as soon as

$$ \iint_{Y^2} \phi^2(y,z)\, dm_0(y)\, dm_0(z) < 1. $$

Whence in this case, the uniqueness result of Corollary 3.6 applies.
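
The Cauchy-Schwarz step can be spelled out as follows. This is a sketch under the assumption that the (possibly signed) density $\nu$ belongs to $L^2(m_0)$, with $\|\phi\|$ denoting the norm of $\phi$ in $L^2(m_0 \otimes m_0)$, a notation introduced here for convenience only.

```latex
\begin{aligned}
\Big| \iint_{Y^2} \phi(y,z)\, \nu(y)\, \nu(z)\, dm_0(y)\, dm_0(z) \Big|
 &\le \Big( \iint_{Y^2} \phi^2\, dm_0\, dm_0 \Big)^{1/2}
      \Big( \iint_{Y^2} \nu(y)^2\, \nu(z)^2\, dm_0(y)\, dm_0(z) \Big)^{1/2} \\
 &= \|\phi\| \int_Y \nu^2\, dm_0 ,
\end{aligned}
\qquad \text{hence} \qquad
\int_Y \nu^2\, dm_0 + \iint_{Y^2} \phi\, d\nu\, d\nu
 \;\ge\; \big(1 - \|\phi\|\big) \int_Y \nu^2\, dm_0 .
```

When $\|\phi\| < 1$ the right-hand side is positive for every $\nu \neq 0$, which is the positive definiteness used above.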

3.3. The case of Inada's condition 

We now consider the case where $V$ contains a local congestion term that satisfies
an Inada-like condition:

$$ \lim_{\nu \to 0^+} f(\nu) = -\infty \quad \text{and} \quad \lim_{\nu \to +\infty} f(\nu) = +\infty. \tag{3.8} $$
This will imply that minimisers of (3.5) are positive $m_0$-a.e. The optimality
condition $\varphi^c + V[\nu] = M$, for some constant $M$, will therefore be satisfied $m_0$-a.e.,
which implies the regularity of $\nu$. More precisely, let us consider the case where
the interactions are given for $\nu \in \mathcal{P}(Y) \cap L^1(m_0)$ by the map:

$$ V[\nu](y) = f(\nu(y)) + \int_Y \phi(y,z)\, d\nu(z) \tag{3.9} $$

(for the sake of simplicity we have dropped the dependence in $y$ of $f$) where

• $\phi \in C(Y \times Y)$ is symmetric,

• $f : (0,+\infty) \to \mathbb{R}$ is continuous increasing, locally integrable on $[0,+\infty)$ and
satisfies the Inada condition (3.8).

We then define $F$ by $F(0) = 0$ and $F' = f$ so that $F$ is strictly convex, continuous
on $[0,+\infty)$ and $C^1$ on $(0,+\infty)$, bounded from below and coercive i.e. $F(\nu)/\nu \to +\infty$
as $\nu \to +\infty$. As before, for any $\nu$ in $\mathcal{P}(Y) \cap L^1(m_0)$, we define the associated cost
functional

$$ \mathcal{E}[\nu] := \int_Y F(\nu(y))\, dm_0(y) + \frac{1}{2} \iint_{Y^2} \phi(y,z)\, d\nu(y)\, d\nu(z). $$

The typical example we have in mind is $f(\nu) = \log(\nu)$ and $F(\nu) = \nu \log \nu - \nu$,
or simply $F(\nu) = \nu \log \nu$ since we are only dealing with probability measures. In
this case, the domain of $\mathcal{E}$ consists of absolutely continuous measures with finite
entropy. Again, we look for equilibria by solving the minimisation problem (3.5).
The implication of Inada's condition on the interiority of minimisers is given by
the following:

Lemma 3.7 (Existence and positivity of minimisers) Under the above assumptions,
the variational problem (3.5) admits solutions and if $\nu$ is such a solution then
$\nu \ge \delta$ $m_0$-a.e. for some $\delta > 0$ and $\nu \in L^\infty(m_0)$.

The proof (see Appendix B.4) relies on the fact that since $f(0^+) = -\infty$ the
functional $\mathcal{E}$ abhors a vacuum. Now that we know that minimisers $\nu$ exist and
are bounded from above and bounded away from 0, under the assumptions of
Lemma A.1 it is easy to see, as in the previous paragraph, that they necessarily
are equilibria and satisfy the optimality condition (3.6):

$$ \varphi^c(y) + f(\nu(y)) + \int_Y \phi(y,z)\, d\nu(z) = M $$

for some constant $M$ and where as usual $\varphi$ is the Kantorovich potential between
$\mu$ and $\nu$ and $\varphi^c$ is its $c$-transform. Note that this equality is true not only $\nu$-a.e.
but $m_0$-a.e.; one can then invert this relation to deduce that $\nu$ coincides $m_0$-a.e. with
the continuous function

$$ \nu(y) = f^{-1}\Big( M - \varphi^c(y) - \int_Y \phi(y,z)\, d\nu(z) \Big). \tag{3.10} $$
In particular there exist equilibria that have a continuous representative.
Relation (3.10) is however not very tractable in general since it involves the very
indirect quantity $\varphi^c$ and an integral term. We will see in Section 4 how it can be
simplified and reformulated as a nonlinear partial differential equation in the case
of a quadratic cost.

For the moment, the Inada condition has just enabled us to prove some further
regularity properties of minimisers hence of some special equilibria. Let us
summarise all this by:

Theorem 3.8 (Main results under the Inada condition) Let $V$ be of the form (3.9)
where $f$ and $\phi$ satisfy the assumptions of this paragraph. If $\nu$ solves (3.5) and
$\gamma \in \Pi_o(\mu,\nu)$ then $\gamma$ is an equilibrium; in particular, there exist equilibria. Moreover
any minimiser $\nu$ of (3.5) is bounded and bounded away from 0 and coincides
$m_0$-a.e. with the continuous function given by (3.10).

If, in addition, $\mathcal{E}$ is convex, $\gamma$ is an equilibrium if and only if $\gamma \in \Pi_o(\mu,\nu)$ where
$\nu$ solves (3.5).

If, in addition, $\mathcal{E}$ is strictly convex, there is uniqueness of the equilibrium second
marginal $\nu$.

4. HIDDEN CONVEXITY AND FURTHER UNIQUENESS RESULTS 

So far, our variational approach has enabled us to prove the existence of equilib- 
ria by the minimisation problem (3.5). However, the previous results are not totally 
satisfying since in general there might exist equilibria that are not minimisers and 
even if we are only interested in the special equilibria obtained by minimisation, 
optimality conditions like (3.10) are not tractable enough to provide a full char- 
acterisation. Under further convexity conditions that are quite stringent we have 
seen that equilibria necessarily are minimisers and obtained uniqueness of both. In 
the case where 



5 Inada's condition is actually not essential to obtain a relation of the form (3.10). Indeed, in
the case of a power congestion function, $f(\nu) = \nu^\alpha$, $\alpha > 0$, using the positive part function, one
obtains a similar relation



(3.10) 
there is a competition between the convexity of the congestion term that favours
dispersion and the non-convexity of the interaction term so that in general nothing 
can be said about the convexity of £ in the usual sense. We shall see however, 
that some convexity structure, more adapted to optimal transport, can be used 
to derive new uniqueness and characterisation results. The aim of this section is 
precisely to exploit some hidden convexity structure in one dimension and in higher 
dimensions when the cost is quadratic. This goal can be achieved thanks to the
very powerful notion of displacement convexity (or some slight variant of it) due
to McCann [1997]. In recent years, these notions of convexity, intimately linked
to optimal transport, have proved to be an extremely useful and flexible tool, in
particular in the study of nonlinear diffusions. To our knowledge, this is the first
time they are used in an economic context; see also Blanchet et al. [2012]. We
refer to Appendix A for a very short presentation and Section 4.1 for a detailed
exposition in the easier one-dimensional case. Much more on this rich subject can
be found in the books Ambrosio et al. [2005], Villani [2003, 2009].



4.1. Hidden convexity in dimension one 

Let us start with the simple one-dimensional case where the intuition is easy
to understand: the functional $J_\mu$ is not convex with respect to $\nu$ but it is with
respect to $T$, the optimal transport map from $\mu$ to $\nu$. Let us take $X = Y = [0, 1]$,
$m_0$ the Lebesgue measure on $[0, 1]$, $\mu$ absolutely continuous with respect to
the Lebesgue measure, and assume that $V[\nu]$ takes the form:

$$V[\nu](y) = f(\nu(y)) + v(y) + \int_{[0,1]} \phi(y, z)\, d\nu(z)$$

and that 

• the transport cost $c$ is of the form $c(x, y) = C(x - y)$ where $C$ is strictly
convex and differentiable,

• $f$ is increasing,

• $v$ is convex on $[0, 1]$ and $\phi$ is convex, symmetric, differentiable and has a
locally Lipschitz gradient.

As already noted the corresponding cost

$$\mathcal{E}[\nu] := \int_0^1 F(\nu(y))\, dy + \frac{1}{2} \iint_{[0,1]^2} \phi(y, z)\, d\nu(y)\, d\nu(z) + \int_0^1 v(y)\, d\nu(y)$$

(with $F' = f$) is not convex in the usual sense in general and neither is the
functional $J_\mu = W_c(\mu, \cdot) + \mathcal{E}$.

However, we shall see that $J_\mu$ has good convexity properties when one considers
the following interpolation. Let $(\rho, \nu) \in \mathcal{P}([0, 1])^2$; then there is a unique optimal
transport map $T_0$ (respectively $T_1$) from $\mu$ to $\nu$ (respectively from $\mu$ to $\rho$) for the
cost $c$ and it is non-decreasing (see Villani [2003]). For $t \in [0, 1]$, let us define:

$$\nu_t := T_t \# \mu \quad \text{where} \quad T_t := (1 - t) T_0 + t T_1,$$

then by construction, the curve $t \mapsto \nu_t$ connects $\nu_0 = \nu$ to $\nu_1 = \rho$.
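
As a rough numerical illustration of this interpolation (a sketch under stated assumptions, not part of the original argument): in dimension one, with $\mu$ atomless, the monotone maps can be written $T_0 = G_\nu \circ F_\mu$ and $T_1 = G_\rho \circ F_\mu$, where $F_\mu$ is the cumulative distribution function of $\mu$ and $G_\nu$, $G_\rho$ are quantile functions, so that the quantile function of $\nu_t$ is $(1 - t) G_\nu + t G_\rho$. The Python snippet below uses this to evaluate $t \mapsto \int v \, d\nu_t$ on a grid and to inspect its discrete convexity. The two quantile functions, the potential and all helper names are hypothetical choices made for illustration only.

```python
import numpy as np


def interpolate_quantiles(q_nu, q_rho, t):
    """Quantile function of nu_t, sampled at the same levels as q_nu and q_rho."""
    return (1.0 - t) * q_nu + t * q_rho


def potential_energy(q, v):
    """Approximate the integral of v against the measure with quantile samples q."""
    return float(np.mean(v(q)))


def v(y):
    """Illustrative convex potential (an assumption, not taken from the paper)."""
    return (y - 0.3) ** 2


# Midpoints of a uniform grid of levels in (0, 1).
levels = (np.arange(200) + 0.5) / 200
q_nu = 0.5 * levels   # illustrative choice: nu uniform on [0, 1/2]
q_rho = levels ** 2   # illustrative choice: rho with quantile function x^2

ts = np.linspace(0.0, 1.0, 11)
energies = np.array(
    [potential_energy(interpolate_quantiles(q_nu, q_rho, t), v) for t in ts]
)

# Discrete second differences in t; convexity means they are non-negative
# (up to rounding errors).
second_diffs = energies[:-2] - 2.0 * energies[1:-1] + energies[2:]
print(second_diffs.min() >= -1e-12)
```

The same quantile representation is what makes the one-dimensional case tractable: all the terms of $J_\mu$ become integrals of convex functions of $T_t$, which is affine in $t$.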

Definition 4.1 A functional $J : \mathcal{P}(Y) \to \mathbb{R} \cup \{+\infty\}$ is called displacement
convex whenever $t \in [0, 1] \mapsto J[\nu_t]$ is convex (for every choice of endpoints $\nu$ and
$\rho$); it is called strictly displacement convex when, in addition,
$J[\nu_t] < (1 - t) J[\nu] + t J[\rho]$ when $t \in (0, 1)$ and $\rho \neq \nu$.

We claim that $J_\mu$ is strictly displacement convex; indeed, take $(\nu, \rho)$ two
probability measures in the domain of $\mathcal{E}$ (which is convex by convexity of $F$), define $\nu_t$
as above and let us consider the four terms in $J_\mu$ separately:

• By definition of $W_c$, $\nu_t$ and the strict convexity of $C$ we have

$$W_c(\mu, \nu_t) \le \int_0^1 C\big(x - ((1 - t) T_0(x) + t T_1(x))\big)\, d\mu(x)$$

$$\le (1 - t) \int_0^1 C(x - T_0(x))\, d\mu(x) + t \int_0^1 C(x - T_1(x))\, d\mu(x)$$

$$= (1 - t) W_c(\mu, \nu) + t W_c(\mu, \rho)$$

with a strict inequality if $t \in (0, 1)$ and $\nu \neq \rho$,

• By construction

$$\int_0^1 v\, d\nu_t = \int_0^1 v(T_t(x))\, d\mu(x)$$

which is convex with respect to $t$, by convexity of $v$,

• Similarly

$$\iint_{[0,1]^2} \phi(y, z)\, d\nu_t(y)\, d\nu_t(z) = \iint_{[0,1]^2} \phi(T_t(x), T_t(y))\, d\mu(x)\, d\mu(y)$$

is convex with respect to $t$, by convexity of $\phi$,

• The convexity of the remaining congestion term is more involved. Since $\nu_t = T_t \# \mu$
and $T_t$ is non-decreasing, at least formally (see footnote 6) we have
$\nu_t(T_t(x))\, T_t'(x) = \mu(x)$; by the change of variables formula we also have

$$\int_0^1 F(\nu_t(y))\, dy = \int_0^1 F(\nu_t(T_t(x)))\, T_t'(x)\, dx = \int_0^1 F\Big(\frac{\mu(x)}{T_t'(x)}\Big)\, T_t'(x)\, dx$$

and we conclude by observing that $\alpha \mapsto F(\mu(x) \alpha^{-1}) \alpha$ is convex and that
$T_t'(x)$ is linear in $t$.



6 see Ambrosio et al. [2005], Villani [2003, 2009] for a rigorous justification. 
Under the assumptions above, $J_\mu$ is therefore strictly displacement convex and
thus admits at most one minimiser (indeed if $\nu$ and $\rho$ were different minimisers,
by strict displacement convexity, one would have
$J_\mu[\nu_{1/2}] < \frac{1}{2} J_\mu[\nu] + \frac{1}{2} J_\mu[\rho]$).
Actually more is true (see Appendix B.5 for details): if $\gamma$ is an equilibrium then its
second marginal solves (3.5) and therefore is unique. Since $c$ satisfies the generalised
Spence-Mirrlees condition (see Corollary 2.3), we deduce the following uniqueness
result

Theorem 4.2 (Uniqueness of an equilibrium by displacement convexity in dimension
one) Under the assumptions above, we have the equivalence

$$\nu \text{ is a minimiser of (3.5) and } \gamma \in \Pi_o(\mu, \nu) \iff \gamma \text{ is an equilibrium}$$

and since $J_\mu$ is strictly displacement convex, there is uniqueness of the equilibrium
(which is actually necessarily pure).

4.2. Hidden convexity under quadratic cost 

The arguments of the previous paragraph can be generalised in higher dimensions 
when the transport cost is quadratic. Throughout this section, we will assume the 
following: 

• $X = Y = \overline{\Omega}$ where $\Omega$ is some open bounded convex subset of $\mathbb{R}^d$,

• $\mu$ is absolutely continuous with respect to the Lebesgue measure (that will
be the reference measure $m_0$ from now on) and has a positive density on $\overline{\Omega}$,

• $c$ is quadratic i.e.

$$c(x, y) := \frac{1}{2} |x - y|^2, \quad (x, y) \in \mathbb{R}^d \times \mathbb{R}^d,$$

• $V$ again takes the form

$$V[\nu](y) = f(\nu(y)) + v(y) + \int_Y \phi(y, z)\, d\nu(z)$$

where $v$ is convex, $f$ satisfies the assumptions of Section 3.3 and
$\phi \in C(\mathbb{R}^d \times \mathbb{R}^d)$ is symmetric and $C^{1,1}_{loc}$ (i.e. $C^1$ with a locally Lipschitz gradient).

Again denoting by $F$ the primitive of $f$ that vanishes at 0, the corresponding
energy reads

$$\mathcal{E}[\nu] = \int_Y F(\nu(y))\, dy + \int_Y v(y)\, d\nu(y) + \frac{1}{2} \iint_{Y \times Y} \phi(y, z)\, d\nu(z)\, d\nu(y).$$

Note that as $c$ is quadratic, Brenier's Theorem (see Theorem A.2) implies the
uniqueness and the purity of optimal plans $\gamma$ between $\mu$ and an arbitrary $\nu$. The
variational problem (3.5) then takes the form

$$\inf_{\nu} J_\mu[\nu] \quad \text{where} \quad J_\mu[\nu] := \frac{1}{2} W_2^2(\mu, \nu) + \mathcal{E}[\nu] \tag{4.1}$$

where $W_2^2(\mu, \nu)$ is the squared 2-Wasserstein distance between $\mu$ and $\nu$ i.e.:

$$W_2^2(\mu, \nu) := \inf_{\gamma \in \Pi(\mu, \nu)} \int_{X \times Y} |x - y|^2\, d\gamma(x, y).$$

Two more structural assumptions are needed to guarantee the strict convexity
of $J_\mu$ along generalised geodesics with base $\mu$ (see Appendix A for details), namely
McCann's condition:

$$\nu \mapsto \nu^d F(\nu^{-d}) \text{ is convex non-increasing on } (0, +\infty) \tag{4.2}$$

and that $\phi$ is convex. Note that McCann's condition is satisfied by power functions
with an exponent larger than 1 as well as by the entropy $F(\nu) = \nu \log(\nu)$.
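
As a quick check of these two examples, here is a short worked computation added for illustration. The variable of (4.2) is written $s$, the function in (4.2) is called $g$ and the power exponent is called $m$; these three names are notational assumptions, not the paper's.

```latex
% Entropy: F(s) = s \log s
g(s) := s^{d} F(s^{-d}) = s^{d}\, s^{-d} \log\big(s^{-d}\big) = -d \log s,
\qquad g'(s) = -\frac{d}{s} < 0,
\qquad g''(s) = \frac{d}{s^{2}} > 0 .

% Power: F(s) = s^{m} with m > 1, and a := d(m-1) > 0
g(s) := s^{d} F(s^{-d}) = s^{d - dm} = s^{-a},
\qquad g'(s) = -a\, s^{-a-1} < 0,
\qquad g''(s) = a(a+1)\, s^{-a-2} > 0 .
```

In both cases $g$ is convex and non-increasing on $(0, +\infty)$, as required by (4.2).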

In contrast with Theorem 3.8 where minimisers are equilibria but the reverse is
not always true, convexity along generalised geodesics ensures the converse
property:

Theorem 4.3 (Equilibria and minimisers coincide, uniqueness and regularity under
generalized convexity) Under the assumptions above, we have the equivalence

$$\nu \text{ is a minimiser of (4.1) and } \gamma \in \Pi_o(\mu, \nu) \iff \gamma \text{ is an equilibrium.}$$

Moreover, there exists a unique equilibrium (which is actually pure) and the second
marginal $\nu$ of this equilibrium has a continuous density.

4.3. A partial differential equation for the equilibrium 

In the quadratic cost framework of Paragraph 4.2, our aim now is to write the
optimality condition (3.10) in the form of a nonlinear and non-local partial
differential equation of Monge-Ampère type. For computational simplicity, we take
$v = 0$ and $f(\nu) = \log(\nu)$ but any convex, $C^{1,1}_{loc}$, symmetric $v$ and any increasing $f$
satisfying McCann's condition would lead to a similar partial differential equation.
Let us recall that the unique minimiser/equilibrium $\nu$ satisfies (3.10) and is actually
characterised by this condition. In this equation, the least explicit term is $\varphi^c$. Thanks
to Brenier's theorem (see Appendix A), this term can be made more explicit, as
follows: the Brenier map $T$ between $\mu$ and $\nu$ is the gradient of some convex function,
$T = \nabla u$ with $u$ convex, and similarly the Brenier map between $\nu$ and $\mu$ is $\nabla u^*$
where $u^*$ is the Legendre transform of $u$. In the case of a quadratic cost, $u$ and $u^*$
are related to the Kantorovich potential $\varphi$ and its c-transform $\varphi^c$ through

$$\varphi(x) = \frac{1}{2} |x|^2 - u(x), \quad \varphi^c(y) = \frac{1}{2} |y|^2 - u^*(y), \quad \forall (x, y) \in \Omega \times \Omega. \tag{4.3}$$

When $\nabla u$ is smooth enough, the Monge-Ampère equation, see (A.6), reads:

$$\mu(x) = \det(D^2 u(x))\, \nu(\nabla u(x)), \quad \forall x \in \Omega$$

which has to be supplemented with the natural sort of boundary condition

$$\nabla u(\Omega) = \Omega. \tag{4.4}$$

We may then rewrite the optimality condition (3.10) as

$$\nu(y) = C \exp\Big( -\frac{1}{2} |y|^2 + u^*(y) - \int_Y \phi(y, z)\, d\nu(z) \Big) \tag{4.5}$$

where $C$ is a normalisation constant that makes the total mass of the right hand
side be 1 on $\Omega$. Since $u$ is defined up to an additive constant, one may actually
choose $C = 1$. As $\nabla u \# \mu = \nu$, we first have

$$\int_Y \phi(\nabla u(x), z)\, d\nu(z) = \int_\Omega \phi(\nabla u(x), \nabla u(z))\, d\mu(z).$$

On the other hand, using the well-known convex analysis identity

$$u^*(\nabla u(x)) = x \cdot \nabla u(x) - u(x)$$

and performing the change of variable $y = \nabla u(x)$ in (4.5), the Monge-Ampère
equation then becomes

$$\mu(x) = \det(D^2 u(x)) \exp\Big( -\frac{1}{2} |\nabla u(x)|^2 + x \cdot \nabla u(x) - u(x) \Big) \times \exp\Big( -\int_\Omega \phi(\nabla u(x), \nabla u(z))\, d\mu(z) \Big). \tag{4.6}$$

The equilibrium problem is therefore equivalent to a non-local and nonlinear partial 
differential equation. 

This kind of partial differential equation is rather complicated. However, in
dimension 1, i.e. when $\Omega$ is an open interval, which we can assume to be $(0, 1)$, the
boundary condition (4.4) is $u'(0) = 0$, $u'(1) = 1$ and the Monge-Ampère
equation (4.6) simplifies to

$$\mu(x) = u''(x) \exp\Big( -\frac{1}{2} u'(x)^2 + x\, u'(x) - u(x) - \int_{(0,1)} \phi(u'(x), u'(z))\, d\mu(z) \Big).$$
Note that this differential equation automatically implies the strict convexity of
u. We do not know whether this equation can be solved numerically in an iterative 
way but will see how the equilibrium can be computed in one dimension thanks to 
a convenient reformulation of (4.1) as explained below. 



4.4. Numerical computations in dimension one 

Let $\Omega = (0, 1)$, $m_0$ be the Lebesgue measure on $(0, 1)$, $\mu$ be absolutely continuous
with respect to $m_0$ with a positive density still denoted $\mu$. Consider again (see
footnote 7) the variational problem

$$\inf_{\nu} J_\mu[\nu] \tag{4.7}$$

where

$$J_\mu[\nu] := \frac{1}{2} W_2^2(\mu, \nu) + \int_0^1 \nu(y) \log \nu(y)\, dy + \frac{1}{2} \iint_{[0,1]^2} \phi(y, z)\, d\nu(y)\, d\nu(z)$$

and $\phi$ is $C^{1,1}_{loc}$, convex and symmetric. Looking for $\nu$ amounts to looking for its
rearrangement or quantile function:

$$G(x) := \inf\{\lambda : \nu([0, \lambda]) > x\}, \quad \forall x \in (0, 1).$$

Note that $G$ is non-decreasing and $G \# m_0 = \nu$. We also denote by $H$ the quantile of
$\mu$. It is then well known (see [Villani, 2003, Section 2.2] or [Ambrosio et al., 2005,
Theorem 6.0.2]) that

$$W_2^2(\mu, \nu) = \int_0^1 |G(x) - H(x)|^2\, dx.$$

Moreover since $G \# m_0 = \nu$, we have

$$\iint_{[0,1]^2} \phi(y, z)\, d\nu(y)\, d\nu(z) = \iint_{[0,1]^2} \phi(G(x), G(\theta))\, dx\, d\theta.$$

And since $\nu$ is regular the change of variable formula yields

$$\int_0^1 \nu(x) \log(\nu(x))\, dx = \int_0^1 \nu(G(x)) \log(\nu(G(x)))\, G'(x)\, dx = -\int_0^1 \log(G'(x))\, dx,$$

as, by definition of $G$, $\nu(G(x))\, G'(x) = 1$ a.e.

Therefore reformulating (4.7) in terms of quantile consists in minimising the
strictly convex functional

$$\frac{1}{2} \int_0^1 |G(x) - H(x)|^2\, dx - \int_0^1 \log(G'(x))\, dx + \frac{1}{2} \iint_{[0,1]^2} \phi(G(x), G(\theta))\, dx\, d\theta$$



7 Again the choice of a logarithmic congestion function is not so essential, power congestion 
functions can be considered as well. 
subject to the boundary conditions $G(0) = 0$, $G(1) = 1$. The discretisation of this
variational problem is easy to solve using standard gradient descent methods, see
Figure 1 (and footnote 8).
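
The Python sketch below shows one possible way to carry out such a discretisation; it is not the authors' code and makes several assumptions that should be read as illustrative only: a uniform grid of nodes on $[0, 1]$, plain Riemann sums, finite differences for $G'$, the boundary conditions $G(0) = 0$, $G(1) = 1$ kept fixed (whereas footnote 8 indicates they were relaxed in the paper's simulations), the interaction $\phi(y, z) = \frac{1}{2}(y - z)^2$, and $\mu$ with density $2y$ so that $H(x) = \sqrt{x}$. The function names `energy`, `gradient` and `solve` are hypothetical, and the backtracking rule is a standard safeguard chosen here, not something specified in the text.

```python
import numpy as np


def energy(G, H, h, phi):
    """Discretised quantile functional; +inf when G is not strictly increasing."""
    dG = np.diff(G)
    if np.any(dG <= 0.0):
        return np.inf
    transport = 0.5 * h * np.sum((G - H) ** 2)
    entropy = -h * np.sum(np.log(dG / h))
    interaction = 0.5 * h * h * np.sum(phi(G[:, None], G[None, :]))
    return transport + entropy + interaction


def gradient(G, H, h, dphi):
    """Gradient of `energy` in the nodal values (dphi = d(phi)/d(first argument))."""
    inv = 1.0 / np.diff(G)
    g = h * (G - H) + h * h * np.sum(dphi(G[:, None], G[None, :]), axis=1)
    g[1:-1] += h * (inv[1:] - inv[:-1])
    g[0] = g[-1] = 0.0  # keep G(0) = 0 and G(1) = 1 fixed
    return g


def solve(H, phi, dphi, n_iter=5000, step=1e-2):
    """Gradient descent with backtracking, started from the quantile G(x) = x."""
    n = H.size - 1
    h = 1.0 / n
    G = np.linspace(0.0, 1.0, n + 1)
    J = energy(G, H, h, phi)
    for _ in range(n_iter):
        g = gradient(G, H, h, dphi)
        t = step
        while t > 1e-12:
            G_new = G - t * g
            J_new = energy(G_new, H, h, phi)
            if J_new <= J:  # accept only increasing, energy-decreasing iterates
                G, J = G_new, J_new
                break
            t *= 0.5
    return G, J


def phi(y, z):
    """Illustrative convex symmetric interaction (an assumption)."""
    return 0.5 * (y - z) ** 2


def dphi(y, z):
    """Derivative of phi with respect to its first argument."""
    return y - z


x = np.linspace(0.0, 1.0, 41)
H = np.sqrt(x)  # quantile function of the density 2y on (0, 1)
G, J = solve(H, phi, dphi)
# Density of nu at the cell midpoints, from the relation nu(G(x)) G'(x) = 1.
density = (1.0 / 40) / np.diff(G)
```

Working with the quantile function rather than the density keeps the problem convex, so a plain descent method with a monotonicity safeguard is enough for a sketch; a Newton-type method would probably converge faster because the logarithmic term makes the problem stiff on fine grids.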






FIGURE 1. The distribution $\mu$ of the agents (dashed line) and the solution $\nu$ to (4.7)
in the case $f(x) = x^8$, $\phi(z) = 10^{-4} |z|^2$ and $v = (x - 10)^4$.



5. CONCLUDING REMARKS 

We conclude the paper with two remarks. For the sake of simplicity, we adopt
exactly the same framework and notation as in Section 4.2.



5.1. Implementation by taxes 

By Theorem 4.3 the unique equilibrium is the unique minimiser of the functional
$J_\mu$. It would therefore be tempting to interpret this result as a kind of welfare
theorem. A simple comparison between $J_\mu$ and the total social cost tells us however
that the equilibrium is not efficient. Indeed, the total social cost $SC[\nu]$ is the sum

8 Actually, in our simulations, we have relaxed the condition that the support of v is [0, 1] i.e. 
the boundary conditions G(0) = 0, G(l) = 1. 
of the transport cost $W_2^2(\mu, \nu)/2$ and the additional cost $\int_Y V[\nu](y)\, \nu(y)\, dy$ i.e.

$$SC[\nu] = \frac{1}{2} W_2^2(\mu, \nu) + \int_Y f(\nu(y))\, d\nu(y) + \int_Y v\, d\nu + \iint_{Y \times Y} \phi(y, z)\, d\nu(y)\, d\nu(z).$$

The second term represents the total congestion cost and the last one the total
interaction cost. The functional $J_\mu$ whose minimiser is the equilibrium has a similar
form, except that in its second term $f(\nu)\nu$ is replaced by $F(\nu)$ (with $F' = f$) and
the interaction term is divided by 2. The equilibrium corresponds indeed to the
case where agents selfishly minimise their own cost

$$c(x, \cdot) + V[\nu] = c(x, \cdot) + f(\nu(\cdot)) + v + \int_Y \phi(\cdot, z)\, d\nu(z).$$

This individual minimisation has of course no reason to correctly estimate the 
marginal effect of individual behaviour on the total social cost. In other words, 
there is some gap between the equilibrium and the efficient (social-cost minimising) 
configurations, and, since we are dealing with a situation with externalities, this 
is actually not surprising. The computation of the equilibrium and the optimum 
can be done numerically in dimension 1 by using the same kind of numerical 
computations as explained in Section 4.4, see Figure 2. 




FIGURE 2. The optimum (continuous line) and the equilibrium (dashed line) on the
left. The corresponding taxes on the right.

The natural way to restore efficiency of the equilibrium is the design by some
social planner of a proper system of tax/subsidies which, added to $V[\nu]$, will
implement the efficient configuration (or at least a stationary point of the social cost).
Thanks to our variational approach, a tax system that restores the efficiency is
easy to compute (up to an additive constant):

$$\mathrm{Tax}[\nu](y) = f(\nu(y))\, \nu(y) - F(\nu(y)) + \int_Y \phi(y, z)\, d\nu(z).$$
The two terms in $\mathrm{Tax}[\nu]$ represent respectively a correction to the individual
estimation of congestion cost and to the individual estimation of interaction cost. A
similar inefficiency of equilibria arises in the slightly different framework of
congestion games, where it is usually referred to as the cost of anarchy and
has been extensively studied in recent years (see Roughgarden [2005] and the
references therein). In our Cournot-Nash context, we may similarly define the cost
of anarchy as the ratio of the worst social cost of an equilibrium to the minimal
social cost value:

$$\text{Cost of anarchy} := \frac{\max\{SC[\nu_e] : \nu_e \text{ equilibrium}\}}{\min_{\nu} SC[\nu]}.$$

In the previous numerical example of Figure 2, both the equilibrium and the op- 
timum are unique and the cost of anarchy can be numerically computed as being 
approximately 1.8. 



5.2. A dynamical perspective

Instead of minimising $J_\mu$ directly, we may think that agents start with some
distribution of strategies (that is not an equilibrium) and adjust it with time by a
sort of gradient descent dynamics to decrease their individual cost dynamically. At
least formally, a way to reach the equilibrium (or minimiser of $J_\mu$) is then to put it
into the dynamical perspective of the minimising movement scheme as follows. Fix
a time step $\tau > 0$ and start with an initial configuration of strategies $\nu_0$. The first
step of the minimising movement scheme selects a new distribution of strategies
$\nu_1$ close to $\nu_0$ (in $W_2$) but also decreasing $J_\mu$ by

$$\nu_1 \in \operatorname{argmin}_{\nu} \Big\{ \frac{1}{2\tau} W_2^2(\nu_0, \nu) + J_\mu[\nu] \Big\}.$$

And then it iterates the process by choosing

$$\nu_{k+1} \in \operatorname{argmin}_{\nu} \Big\{ \frac{1}{2\tau} W_2^2(\nu_k, \nu) + J_\mu[\nu] \Big\}. \tag{5.1}$$



This sort of Wasserstein Euler scheme was first introduced in Jordan et al. [1998]
for the Fokker-Planck equation. Under suitable conditions it is possible to pass
to the continuous limit $\tau \to 0^+$ in the minimising scheme (5.1) and prove that the
solution converges in some sense to the continuous evolution equation

$$\frac{\partial \nu}{\partial t} + \operatorname{div}\Big( -\nu \nabla \Big( \frac{\delta J_\mu}{\delta \nu} \Big) \Big) = 0, \qquad \nu|_{t=0} = \nu_0$$

which is the gradient flow of $J_\mu$ in the Wasserstein space (see Ambrosio et al. [2005]
for a detailed exposition of the theory). By construction $J_\mu$ is a Lyapunov function
of this equation and even though the equation may have non-unique solutions, by
Lyapunov theory, it can be shown under appropriate conditions that its trajectories
converge in large time to the unique minimiser of $J_\mu$ i.e. the equilibrium. If we go
back to the individual level, it can be shown that the equation above corresponds 
to the fact that each agent modifies her strategy according to the gradient flow of 
her individual cost. 
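
In dimension one, and in the setting of Section 4.4 (logarithmic congestion, $v = 0$), the step (5.1) can be rewritten in terms of quantile functions by using the same identities as in Section 4.4. The display below is a sketch added for illustration; $G_k$ denotes the quantile function of $\nu_k$, a notation assumed here and not used in the original text.

```latex
G_{k+1} \in \operatorname*{argmin}_{G}
\left\{ \frac{1}{2\tau} \int_0^1 |G(x) - G_k(x)|^2 \, dx
      + \frac{1}{2} \int_0^1 |G(x) - H(x)|^2 \, dx
      - \int_0^1 \log\big(G'(x)\big) \, dx
      + \frac{1}{2} \iint_{[0,1]^2} \phi\big(G(x), G(\theta)\big) \, dx \, d\theta \right\}
```

Written this way, each step is an implicit Euler (proximal) step in $L^2(0, 1)$ for a strictly convex functional of $G$, since $W_2^2(\nu_k, \nu) = \int_0^1 |G - G_k|^2\, dx$ on the real line.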

We obtain a sequence of densities which converges to the equilibrium, see Fig- 
ure 3. The descent algorithm is very fast and the computed equilibrium is very 
stable with respect to the initial density for the gradient descent as shown in the 
left hand figure of Figure 4. 




FIGURE 3. Convergence and stabilisation toward the equilibrium in the case of a
logarithmic congestion, cubic interaction, and a potential $v(x) := (x - 5)^3$ with $\mathbf{1}_{[0,1]}$ as
initial guess. The inverse of the cumulative function on the left and the corresponding
density on the right.
APPENDIX A: THE OPTIMAL TRANSPORT TOOLBOX

This appendix just gives some basic results from optimal transport theory that we have used
in the paper. For a detailed exposition of this rich and rapidly developing subject, we refer the
interested reader to the very accessible textbook Villani [2003] or Ambrosio et al. [2005], Villani
[2009] or the more probability-oriented textbook Rachev and Rüschendorf [1998].
FIGURE 4. Convergence and stabilisation toward the equilibrium in the case of a
logarithmic congestion, cubic interaction with an initial guess made of two bumps. The
potential is $v(x) := (x - 5)^3$ on the left and $v(x) := (x - 1/2)^3$ on the right.



Kantorovich duality 

Let $X$ and $Y$ be two compact spaces equipped respectively with the Borel probability measures
$\mu \in \mathcal{P}(X)$ and $\nu \in \mathcal{P}(Y)$. For $\mu \in \mathcal{P}(X)$ and $T : X \to Y$ Borel, $T \# \mu$ denotes the push forward (or
image measure) of $\mu$ through $T$ which is defined by $T \# \mu(B) = \mu(T^{-1}(B))$ for every Borel subset
$B$ of $Y$ or equivalently by the change of variables formula

$$\int_Y \varphi\, d(T \# \mu) = \int_X \varphi(T(x))\, d\mu(x), \quad \forall \varphi \in C(Y). \tag{A.1}$$

A transport map between $\mu$ and $\nu$ is a Borel map $T$ such that $T \# \mu = \nu$. Now, let $c \in C(X \times Y)$
be some transport cost function; the Monge optimal transport problem for the cost $c$ consists in
finding a transport $T$ between $\mu$ and $\nu$ that minimises the total transport cost $\int_X c(x, T(x))\, d\mu(x)$.
A minimiser is then called an optimal transport. Monge's problem is in general difficult to solve (it
may even be the case that there is no transport map, for instance it is impossible to transport
one Dirac mass to a sum of distinct Dirac masses); this is why Kantorovich relaxed Monge's
formulation as

$$W_c(\mu, \nu) := \inf_{\gamma \in \Pi(\mu, \nu)} \int_{X \times Y} c(x, y)\, d\gamma(x, y) \tag{A.2}$$

where $\Pi(\mu, \nu)$ is the set of transport plans between $\mu$ and $\nu$ i.e. Borel probability measures on
$X \times Y$ having $\mu$ and $\nu$ as marginals. Since $\Pi(\mu, \nu)$ is weakly * compact and $c$ is continuous, it is easy
to see that the infimum of the linear program defining $W_c(\mu, \nu)$ is attained at some $\gamma$; such optimal
$\gamma$'s are called optimal transport plans (for the cost $c$) between $\mu$ and $\nu$. If there is an optimal $\gamma$
which is induced by a transport map i.e. is of the form $\gamma = (\mathrm{id}, T) \# \mu$ for some transport map
$T$ then $T$ is obviously an optimal solution to Monge's problem. Another advantage of the linear
relaxation is that it possesses a dual formulation that can be very useful. This dual formulation
consists in maximising the linear form $\int_X \varphi\, d\mu + \int_Y \psi\, d\nu$ among all pairs $(\varphi, \psi) \in C(X) \times C(Y)$
such that $\varphi(x) + \psi(y) \le c(x, y)$; it is easy to see that this can be reformulated as a maximisation
over $\varphi$ only:

$$W_c(\mu, \nu) = \sup_{\varphi \in C(X)} \Big\{ \int_X \varphi\, d\mu + \int_Y \varphi^c\, d\nu \Big\} \tag{A.3}$$

where $\varphi^c$ is the c-concave transform of $\varphi$ i.e.

$$\varphi^c(y) := \min_{x \in X} \{c(x, y) - \varphi(x)\}, \quad \forall y \in Y.$$

Formula (A.3) is usually called the Kantorovich duality formula and a maximiser $\varphi$ in (A.3) is called
a Kantorovich potential between $\mu$ and $\nu$ for the cost $c$. The existence of Kantorovich potentials
under our assumptions is well-known (see Rachev and Rüschendorf [1998], Villani [2003, 2009])
and we observe that if $\varphi$ is a Kantorovich potential then so is $\varphi + C$ for every constant $C$.

We have used in Section 3 the following result on the uniqueness of the Kantorovich potential
and the differentiability of $W_c(\mu, \nu)$ with respect to $\nu$:

Lemma A.1 Assume that $X = \overline{\Omega}$, where $\Omega$ is some open bounded connected subset of $\mathbb{R}^d$ with
negligible boundary, that $\mu$ is equivalent to the Lebesgue measure on $X$ (that is both measures have
the same negligible sets) and that for every $y \in Y$, $c(\cdot, y)$ is differentiable with $\nabla_x c$ bounded on
$X \times Y$. Let $\nu \in \mathcal{P}(Y)$; then there exists a unique (up to an additive constant) Kantorovich potential
$\varphi$ between $\mu$ and $\nu$ and for every $\rho \in \mathcal{P}(Y)$ one has

$$\lim_{\varepsilon \to 0^+} \frac{1}{\varepsilon} \big[ W_c(\mu, \nu + \varepsilon(\rho - \nu)) - W_c(\mu, \nu) \big] = \int_Y \varphi^c\, d(\rho - \nu).$$

Proof: The proof of the uniqueness of the Kantorovich potential $\varphi$ between $\mu$ and $\nu$ up to an
additive constant can be found for instance in [Carlier and Ekeland, 2007, Proposition 6.1]. As a
normalisation we choose the potential $\varphi$ such that $\varphi(x_0) = 0$ where $x_0$ is some given point of $X$.
To shorten notations, set $\nu_\varepsilon = \nu + \varepsilon(\rho - \nu)$; thanks to the Kantorovich duality formula (A.3) we have

$$\varepsilon^{-1} [W_c(\mu, \nu_\varepsilon) - W_c(\mu, \nu)] \ge \int_Y \varphi^c\, d(\rho - \nu) \tag{A.4}$$

and similarly if $\varphi_\varepsilon$ denotes the Kantorovich potential between $\mu$ and $\nu_\varepsilon$ such that $\varphi_\varepsilon(x_0) = 0$, we
have

$$\varepsilon^{-1} [W_c(\mu, \nu_\varepsilon) - W_c(\mu, \nu)] \le \int_Y \varphi_\varepsilon^c\, d(\rho - \nu).$$

Now it is well-known that $(\varphi_\varepsilon)_\varepsilon$ is bounded and equi-continuous, uniformly with respect
to $\varepsilon$; hence, thanks to Ascoli's Theorem, up to a sub-sequence, it converges uniformly to some
$\bar{\varphi} \in C(X)$ such that $\bar{\varphi}(x_0) = 0$ and it is easy to see that $\bar{\varphi}$ is a Kantorovich potential between $\mu$
and $\nu$ so that $\varphi = \bar{\varphi}$ and $\varphi_\varepsilon^c$ converges to $\varphi^c$. We then have

$$\limsup_{\varepsilon \to 0^+} \varepsilon^{-1} [W_c(\mu, \nu_\varepsilon) - W_c(\mu, \nu)] \le \int_Y \varphi^c\, d(\rho - \nu).$$

The desired result thus follows from (A.4). Q.E.D.

When $X = Y$ and denoting by $d$ the distance on $Y$, for $p \in [1, +\infty)$, the $p$-Wasserstein distance
between $\mu \in \mathcal{P}(X)$ and $\nu \in \mathcal{P}(X)$ is by definition

$$W_p(\mu, \nu) := \Big( \inf_{\gamma \in \Pi(\mu, \nu)} \int_{X \times Y} d(x, y)^p\, d\gamma(x, y) \Big)^{1/p}. \tag{A.5}$$

The Wasserstein distances are indeed distances and they metrise the weak * topology of $\mathcal{P}(Y)$.
For $p = 1$, it is well-known that the Kantorovich duality formula can be rewritten as

$$W_1(\mu, \nu) = \sup \Big\{ \int_X \varphi\, d(\mu - \nu) : \varphi \text{ 1-Lipschitz} \Big\}$$
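so that for every Lipschitz continuous function $\varphi$ on $X$, one has

$$\Big| \int_X \varphi\, d(\mu - \nu) \Big| \le \mathrm{Lip}(\varphi, X)\, W_1(\mu, \nu),$$

an inequality we will use several times later on. As a simple illustration of the interest of the
distance $W_1$, let us equip $Y^m$ with the distance $(x, y) = (x_1, \dots, x_m, y_1, \dots, y_m) \mapsto d_m(x, y) :=
\sum_{k=1}^m d(x_k, y_k)$. For $\nu$ and $\theta$ in $\mathcal{P}(Y)$, let $\gamma$ be an optimal transport plan between $\nu$ and $\theta$ for $W_1$;
then since $\gamma^{\otimes m} := \gamma \otimes \cdots \otimes \gamma$ has marginals $\nu^{\otimes m}$ and $\theta^{\otimes m}$, we have

$$W_1(\nu^{\otimes m}, \theta^{\otimes m}) \le \int_{Y^m \times Y^m} d_m\, d\gamma^{\otimes m} = m\, W_1(\nu, \theta),$$

which shows in particular that if $(\nu_n)_n$ weakly * converges to $\nu$ then $(\nu_n^{\otimes m})_n$ weakly * converges
to $\nu^{\otimes m}$ i.e. $\int_{Y^m} \varphi\, d\nu_n^{\otimes m}$ converges to $\int_{Y^m} \varphi\, d\nu^{\otimes m}$ as $n \to \infty$ for every $\varphi \in C(Y^m)$.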

Of particular interest is also the quadratic case $p = 2$ in a Euclidean setting, for which a brief
summary of the main results used in the paper is given in the next paragraphs.



We now restrict ourselves to the quadratic case. The solution of the quadratic optimal transport
problem is due to Yann Brenier, whose path-breaking paper Brenier [1991] totally renewed the
field of optimal transport and was the starting point of an extremely active stream of research
since the 90's.

Theorem A.2 (Brenier's theorem) Let $\mu \in \mathcal{P}(\mathbb{R}^d)$ be absolutely continuous with respect to the
Lebesgue measure and compactly supported and $\nu \in \mathcal{P}(\mathbb{R}^d)$ be compactly supported; then the
quadratic optimal transport problem

$$\inf_{\gamma \in \Pi(\mu, \nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2\, d\gamma(x, y)$$

possesses a unique solution $\gamma$ which is in fact a Monge solution $\gamma = (\mathrm{id}, T) \# \mu$. Moreover $T = \nabla u$
$\mu$-a.e. for some convex function $u$ and $\nabla u$ is the unique (up to $\mu$-a.e. equivalence) gradient of a
convex function transporting $\mu$ to $\nu$; $T = \nabla u$ is called the Brenier map between $\mu$ and $\nu$.

In fact the previous theorem holds under much more general assumptions (it is enough that $\mu$
and $\nu$ have finite second moments and that $\mu$ does not charge sets of Hausdorff dimension less
than $d - 1$, see McCann [1995] or Villani [2003]). Brenier's theorem roughly says that there is a
unique optimal transport for the quadratic cost and that it is characterised by the fact that it is
of the form $\nabla u$ with $u$ convex; in other words, solving $\nabla u \# \mu = \nu$ with $u$ convex determines $\nabla u$
uniquely $\mu$-a.e. When we have additional regularity, i.e. when $\mu$ and $\nu$ have regular densities (still
denoted $\mu$ and $\nu$) and $\nabla u$ is a diffeomorphism between the support of $\mu$ and that of $\nu$, thanks
to the change of variables formula, we find that $u$ solves the Monge-Ampère partial differential
equation:

$$\mu = \nu(\nabla u) \det(D^2 u). \tag{A.6}$$

A deep regularity theory due to Luis Caffarelli (Caffarelli [1992a, b]) implies that the Brenier map
is a smooth diffeomorphism when in addition $\mu$ and $\nu$ are smooth, bounded away from 0 and have
convex supports; in particular the Monge-Ampère equation is satisfied in this case, which justifies
the computations of Section 4.3.




The quadratic case and Monge-Ampere equation 
Convexity along generalised geodesics

The last ingredient from optimal transport theory that we have used (in Section 4) is the
powerful notion of displacement convexity along generalised geodesics due to Ambrosio, Gigli
and Savaré, Ambrosio et al. [2005] (see footnote 9). As in Section 4.2, we assume that $X = Y = \overline{\Omega}$ where $\Omega$ is some
open bounded convex subset of $\mathbb{R}^d$, that the cost is quadratic, that $m_0$ is the Lebesgue measure
on $X$ and that $\mu$ is absolutely continuous with respect to $m_0$ and has a positive density on $\overline{\Omega}$. In
particular for every $\nu \in \mathcal{P}(X)$, the Brenier map between $\mu$ and $\nu$ is well-defined. Generalised
geodesics with base $\mu$ for the Wasserstein distance $W_2$ and the corresponding notion of convexity
are defined as follows

Definition A.3 (Convexity along generalised geodesics) Let $\nu \in \mathcal{P}(X)$, $\rho \in \mathcal{P}(X)$, let $T_0$ be
the Brenier map between $\mu$ and $\nu$ and let $T_1$ be the Brenier map between $\mu$ and $\rho$; the generalised
geodesic with base $\mu$ between $\nu$ and $\rho$ is the curve of measures $t \in [0, 1] \mapsto \nu_t := ((1 - t) T_0 + t T_1) \# \mu$.
The functional $J : \mathcal{P}(X) \to \mathbb{R} \cup \{+\infty\}$ is called convex along generalised geodesics with base $\mu$ if
for every pair of endpoints $\nu$ and $\rho$ in $\mathcal{P}(X)$ and for every $t \in [0, 1]$, one has

$$J[\nu_t] \le (1 - t) J[\nu] + t J[\rho].$$

If, in addition, the previous inequality is strict for $t \in (0, 1)$ and $\rho \neq \nu$, $J$ is called strictly convex
along generalised geodesics with base $\mu$.

In Section 4.2, we were interested in the strict convexity along generalised geodesics with base
$\mu$ of the functional $J_\mu$ defined by (4.1). As in Paragraph 4.1, the convexity of

$$t \mapsto \iint_{Y \times Y} \phi(y, z)\, d\nu_t(y)\, d\nu_t(z) \quad \text{and} \quad t \mapsto \int_Y v\, d\nu_t$$

directly follows from the convexity of $\phi$ and $v$ respectively. As for the convexity of $t \mapsto \int_Y F(\nu_t(y))\, dy$
under McCann's condition:

$$\nu \mapsto \nu^d F(\nu^{-d}) \text{ is convex non-increasing on } (0, +\infty), \tag{A.7}$$

it follows from [Ambrosio et al., 2005, Proposition 9.3.9]. Finally, in the functional $J_\mu$ defined by
(4.1), we had the term $W_2^2(\mu, \cdot)$; to see that it is strictly convex along generalised geodesics with
base $\mu$, we can proceed exactly as we did for $W_c(\mu, \cdot)$ in dimension one in Paragraph 4.1.

APPENDIX B: PROOFS OF THE RESULTS 

B.l. Proof of Theorem 3.2 

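Let $\nu$ be a solution of (3.5), $\rho \in \mathcal{D}$ and $\varepsilon \in (0, 1)$; we then have $\varepsilon^{-1} (J_\mu[\nu + \varepsilon(\rho - \nu)] - J_\mu[\nu]) \ge 0$.
Using the fact that $V[\nu]$ is the first variation of $\mathcal{E}$ and Lemma A.1, we thus get

$$\int_Y (\varphi^c + V[\nu])\, d\rho \ge \int_Y (\varphi^c + V[\nu])\, d\nu \tag{B.1}$$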

9 Actually this notion of convexity is a slight variant of the notion of displacement convexity
which first appeared in the seminal work of McCann [1997]. It is known that $W_2^2(\mu, \nu)$, as a function
of $\nu$, is not displacement convex in the sense of McCann (see Example 9.1.5 in Ambrosio et al.
[2005]) and this is the very reason why, following Ambrosio, Gigli and Savaré, we consider convexity
along generalised geodesics with base $\mu$ rather than the initial notion of McCann. Let us however
indicate that, in dimension one, both notions coincide.
where $\varphi$ is a Kantorovich potential between $\mu$ and $\nu$ (it is unique up to an additive constant by
Lemma A.1 and this constant plays no role since $\rho$ and $\nu$ have the same mass). Minimising the
left-hand side of (B.1) with respect to $L^p$ probabilities $\rho$, this yields that $\nu$-a.e.

$$\varphi^c + V[\nu] = \inf_{\rho} \int_Y (\varphi^c + V[\nu])\, d\rho = M$$

where $M := \mathrm{Essinf}(\varphi^c + V[\nu])$ denotes the essential infimum of $(\varphi^c + V[\nu])$ i.e. the largest constant
that bounds $(\varphi^c + V[\nu])$ from below $m_0$-a.e. Since $\gamma$ is an optimal transport plan we have $c(x, y) =
\varphi(x) + \varphi^c(y)$ $\gamma$-a.e. whereas by definition $c(x, z) \ge \varphi(x) + \varphi^c(z)$ for all $(x, z) \in X \times Y$. We thus
have

$$c(x, z) + V[\nu](z) \ge M + \varphi(x) \text{ for all } x \in X \text{ and } m_0\text{-a.e. } z \in Y,$$
$$c(x, y) + V[\nu](y) = M + \varphi(x) \text{ for } \gamma\text{-a.e. } (x, y).$$

This proves that $\gamma$ is a Cournot-Nash equilibrium.

B.2. Proof of Corollary 3.3 

Thanks to Theorem 3.2, it is enough to prove that (3.5) admits solutions and to recall that
the set $\Pi_o(\mu, \nu)$ is nonempty. Let $(\nu_n)_n$ be a minimising sequence of (3.5). Thanks to the growth
condition (3.2), $(\nu_n)_n$ is bounded in $L^p(m_0)$. It thus admits a (not relabelled) sub-sequence that
converges weakly in $L^p(m_0)$ (and thus in particular weakly * in $\mathcal{P}(Y)$) to some $\nu \in L^p(m_0)$.
By the convexity of $F(y, \cdot)$, the first term in $\mathcal{E}$ is lower semi-continuous for the weak topology
of $L^p(m_0)$. By the continuity of $\phi$ the second term in $\mathcal{E}$ is continuous for the weak-* topology
of $\mathcal{P}(Y)$. Finally, the lower semi-continuity of $W_c(\mu, \cdot)$ for the weak-* topology straightforwardly
follows from the Kantorovich duality formula (A.3). We thus have

$$\inf J_\mu = \liminf_{n} \{W_c(\mu, \nu_n) + \mathcal{E}[\nu_n]\} \ge W_c(\mu, \nu) + \mathcal{E}[\nu],$$

so that $\nu$ solves (3.5).

B.3. Proof of Proposition 3.5 

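Assume that $\gamma$ is an equilibrium and let $\nu$ be its second marginal. Let then $\varphi$ be a Kantorovich
potential between $\mu$ and $\nu$ such that

$$\begin{cases} \varphi^c + V[\nu] \ge 0 & m_0\text{-a.e.} \\ \varphi^c + V[\nu] = 0 & \nu\text{-a.e.} \end{cases} \tag{B.2}$$

Let $\rho \in \mathcal{D}$; thanks to the Kantorovich duality formula (A.3), we first have

$$W_c(\mu, \rho) - W_c(\mu, \nu) \ge \int_Y \varphi^c\, d(\rho - \nu).$$

By convexity of $\mathcal{E}$ and (3.4), we obtain

$$\mathcal{E}[\rho] - \mathcal{E}[\nu] \ge \int_Y V[\nu]\, d(\rho - \nu)$$

hence, finally using (B.2) and the fact that $\rho$ is absolutely continuous with respect to $m_0$, we get

$$J_\mu[\rho] - J_\mu[\nu] \ge \int_Y (\varphi^c + V[\nu])\, d(\rho - \nu) \ge 0$$

which means that $\nu$ solves (3.5).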
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B.4. Proof of Lemma 3. 7 

The existence of a minimiser is similar to the proof of Corollary 3.3, since the coercivity of $F$
ensures that minimising sequences are uniformly integrable and its convexity guarantees sequential
weak lower semicontinuity of $\mathcal{J}_\mu$.

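Let $\nu$ solve (3.5) and let us prove that it is bounded away from zero. Let us assume by contradiction
that $m_0(\{\nu \le \delta\}) > 0$ for every $\delta > 0$. Let $\delta_0 > 0$, $\delta \in (0, \delta_0)$ (to be chosen later on). Let
$A := \{x : \delta_0 \le \nu(x) \le M\}$ where $M > 0$ is large enough so that $m_0(A) > 0$. For small $\varepsilon > 0$ such
that $\varepsilon < \delta m_0(A)/2$, define

$$\nu_\varepsilon := \nu + \varepsilon (u_{A_\delta} - u_A)$$

where $A_\delta := \{\nu \le \delta\}$ and, for $m_0(B) > 0$, $u_B$ denotes the (sort of) uniform probability on $B$,
$u_B := m_0(B)^{-1} \mathbf{1}_B$. Since $\varepsilon < \delta m_0(A)/2$ and $\nu \ge \delta$ on $A$, $\nu_\varepsilon$ is a probability measure. By
optimality of $\nu$, we then have

$$\text{(B.3)} \qquad 0 \le \mathcal{J}_\mu[\nu_\varepsilon] - \mathcal{J}_\mu[\nu] = W_c(\mu,\nu_\varepsilon) - W_c(\mu,\nu) + \mathcal{E}[\nu_\varepsilon] - \mathcal{E}[\nu].$$

Denoting by $\varphi_\varepsilon$ a Kantorovich potential between $\mu$ and $\nu_\varepsilon$, we first have

$$W_c(\mu,\nu_\varepsilon) - W_c(\mu,\nu) \le \varepsilon \int_Y \varphi_\varepsilon^c \, d(u_{A_\delta} - u_A).$$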

And since $\varphi_\varepsilon^c$ has a modulus of continuity that is uniform with respect to $\varepsilon$ (that of $c$) and $\varphi_\varepsilon^c$
can be normalised so as to vanish at the same point, $\varphi_\varepsilon^c$ is uniformly bounded independently of
$\varepsilon$ and $\delta$, so that $W_c(\mu,\nu_\varepsilon) - W_c(\mu,\nu) \le C_1 \varepsilon$ for some constant $C_1$. In a similar way, one finds a
constant $C_2$ such that for $\varepsilon$ small enough and uniformly in $\delta$ one has

$$\iint_{Y^2} \phi(y,z) \, d\nu_\varepsilon(y) \, d(\nu_\varepsilon - \nu)(z) \le C_2 \varepsilon.$$

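Now it remains to estimate the last term, namely

$$\int_Y [F(\nu_\varepsilon) - F(\nu)] \, dm_0 = \int_{A_\delta} [F(\nu + \varepsilon m_0(A_\delta)^{-1}) - F(\nu)] \, dm_0 + \int_A [F(\nu - \varepsilon m_0(A)^{-1}) - F(\nu)] \, dm_0.$$

Since $F$ is Lipschitz on $[\delta_0/2, M]$, the second term can be bounded from above by $C_3 \varepsilon$ for a constant
$C_3$ again independent of $\delta$ and $\varepsilon$. Now let $C := C_1 + C_2 + C_3$; thanks to Inada's condition there
is some $\delta_1 \le \delta_0/2$ such that $f \le -C - 1$ on $(0, 2\delta_1]$. Choosing $\delta \le \delta_1$ and $\varepsilon$ small enough so that
$\varepsilon m_0(A_\delta)^{-1} \le \delta_1$, we then have

$$\int_{A_\delta} [F(\nu + \varepsilon m_0(A_\delta)^{-1}) - F(\nu)] \, dm_0 \le (-C - 1) \varepsilon.$$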



Putting everything together, the latter inequality gives the desired contradiction to (B.3). 
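
To make the contradiction explicit, here is a sketch that reads $C_1 \varepsilon$, $C_2 \varepsilon$ and $C_3 \varepsilon$ as upper bounds on the transport part, the interaction part and the contribution of $A$ respectively (this reading of the intermediate estimates is an assumption). Inserting them, together with the last inequality, into (B.3) and using $C = C_1 + C_2 + C_3$ gives

$$0 \le \mathcal{J}_\mu[\nu_\varepsilon] - \mathcal{J}_\mu[\nu] \le C_1 \varepsilon + C_2 \varepsilon + C_3 \varepsilon + (-C - 1) \varepsilon = -\varepsilon < 0,$$

which is impossible for $\varepsilon > 0$.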

The proof of the upper bound is similar: one assumes that $\nu \notin L^\infty(m_0)$ and then considers a
perturbation of the form $\nu_\varepsilon := \nu + \varepsilon (u_C - u_{C_M})$ with $C_M := \{\nu \ge M\}$, $M$ large and $C$ well chosen.
The computations are the same as before, and the contradiction comes from the Inada condition
at $+\infty$: $F'(\nu) = f(\nu) \to +\infty$ as $\nu \to +\infty$.
B.5. Proof of Theorem 4.3

Uniqueness of a minimiser follows directly from the strict convexity along generalised geodesics
with base $\mu$ of $\mathcal{J}_\mu$, which follows from McCann's condition (4.2), the convexity of $v$ and $\phi$ and the
strict convexity along generalised geodesics with base $\mu$ of $W_2^2(\mu,\cdot)$.

Let us now assume that $\gamma$ is an equilibrium and that $\nu$ is its second marginal; then, for some
constant $M$, we have

$$\text{(B.4)} \qquad f(\nu(y)) + v(y) + \int_Y \phi(y,z) \, d\nu(z) + \varphi^c(y) \ge M \ \text{ a.e., with an equality } \nu\text{-a.e.}$$

Thanks to Inada's condition and the fact that the right hand side is continuous, this implies that $\nu$
is bounded away from zero, so that (B.4) actually is an equality (Lebesgue) almost everywhere,
$V[\nu] + \varphi^c = M$, and thus $\nu$ satisfies

$$\nu(y) = f^{-1}\Big( M - v(y) - \int_Y \phi(y,z) \, d\nu(z) - \varphi^c(y) \Big)$$

and is therefore continuous.

Let us now prove that $\nu$ solves (4.1). Let $\rho$ be another probability measure (which we can assume
to have a positive and continuous density as well), and let $t \in [0,1] \mapsto \nu_t$ denote the generalised
geodesic with base $\mu$ joining $\nu$ and $\rho$, i.e. $\nu_t = T_t \# \mu := ((1-t) T_0 + t T_1) \# \mu$ where $T_0$ (resp. $T_1$)
denotes the Brenier map between $\mu$ and $\nu$ (resp. $\rho$). Since $(T_0, T_0 + t(T_1 - T_0)) \# \mu$ has marginals
$\nu$ and $\nu_t$ we have

$$\text{(B.5)} \qquad W_1(\nu, \nu_t) \le \int_X |T_t - T_0| \, d\mu \le t \, \mathrm{diam}(Y).$$
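
The coupling estimate (B.5) can be illustrated numerically in dimension one. The sketch below is not taken from the paper: the base measure (uniform samples on $[0,1]$), the two monotone maps `t0` and `t1`, and the helper names are all hypothetical choices made for illustration. It uses the fact that, on the real line, the $W_1$ distance between two equal-weight empirical measures is obtained by pairing sorted samples.

```python
import numpy as np


def w1_empirical(a, b):
    """W_1 between two equal-weight empirical measures on the real line.

    In one dimension an optimal coupling is monotone, so it pairs sorted samples.
    """
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))


def generalised_geodesic(t0, t1, t):
    """Values of T_t = (1 - t) T_0 + t T_1 at the sample points of the base measure."""
    return (1.0 - t) * t0 + t * t1


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)  # samples of the base measure mu (illustrative choice)
t0 = x ** 2                           # hypothetical monotone map T_0 into Y = [0, 1]
t1 = np.sqrt(x)                       # hypothetical monotone map T_1 into Y = [0, 1]
diam_y = 1.0                          # diameter of Y = [0, 1]

results = []
for t in np.linspace(0.0, 1.0, 11):
    tt = generalised_geodesic(t0, t1, t)
    lhs = w1_empirical(t0, tt)             # W_1(nu, nu_t)
    mid = float(np.mean(np.abs(tt - t0)))  # integral of |T_t - T_0| with respect to mu
    results.append((float(t), lhs, mid, float(t) * diam_y))

for t, lhs, mid, bound in results:
    print(f"t={t:.1f}  W1={lhs:.4f}  coupling cost={mid:.4f}  t*diam(Y)={bound:.4f}")
```

Because both maps are increasing here, the coupling $(T_0, T_t) \# \mu$ is itself monotone, so the first inequality in (B.5) holds with equality up to rounding; in higher dimension it is in general strict.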



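By the convexity of $\mathcal{J}_\mu$ along generalised geodesics with base $\mu$, setting $g(t) = \mathcal{J}_\mu[\nu_t]$ and using
$g(t) \le (1-t) g(0) + t g(1)$ for all $t \in (0,1)$ we have:

$$\mathcal{J}_\mu[\rho] - \mathcal{J}_\mu[\nu] = g(1) - g(0) \ge \frac{1}{t} [g(t) - g(0)] = \frac{1}{t} (\mathcal{J}_\mu[\nu_t] - \mathcal{J}_\mu[\nu]).$$

Let us write $\nu_t$ as $\nu_t := \nu + t h_t$; by the (usual) convexity of $F$ and that of $W_2^2(\mu,\cdot)$, we first have

$$\frac{1}{t} \Big( W_2^2(\mu,\nu_t) + \int_Y F(\nu_t) \, dm_0 - W_2^2(\mu,\nu) - \int_Y F(\nu) \, dm_0 \Big) \ge \int_Y [f(\nu) + \varphi^c] \, h_t \, dm_0.$$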

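Let us now expand $\iint_{Y^2} \phi(y,z) \, d\nu_t(y) \, d\nu_t(z)$ in powers of $t$ as

$$\frac{1}{2t} \iint_{Y^2} \phi(y,z) \, d[(\nu + t h_t)(y) (\nu + t h_t)(z) - \nu(y) \nu(z)] = \iint_{Y^2} \phi(y,z) \, dh_t(y) \, d\nu(z) + R_t$$

where

$$2 |R_t| = \frac{1}{t} \Big| \iint_{Y^2} \phi(y,z) \, d(\nu_t - \nu)(y) \, d(\nu_t - \nu)(z) \Big| \le \frac{1}{t} W_1(\nu_t, \nu) \times \mathrm{Lip}(\psi_t, Y) \le \mathrm{diam}(Y) \, \mathrm{Lip}(\psi_t, Y),$$

where the last inequality follows from (B.5) and $\psi_t$ is defined by

$$\psi_t(y) := \int_Y \phi(y,z) \, d(\nu_t - \nu)(z) = \int_X \big( \phi(y, T_t(x)) - \phi(y, T_0(x)) \big) \, d\mu(x).$$

Since $\nabla \phi$ is locally Lipschitz and $T_t - T_0$ is uniformly bounded by $t \, \mathrm{diam}(Y)$ we find that there
is a constant $C$ such that $\mathrm{Lip}(\psi_t, Y) \le C t$, so that $R_t = O(t)$. Putting everything together and
using (B.4) we get

$$\mathcal{J}_\mu[\rho] - \mathcal{J}_\mu[\nu] \ge \int_Y \Big( f(\nu(y)) + v(y) + \int_Y \phi(y,z) \, d\nu(z) + \varphi^c(y) \Big) \, dh_t(y) + R_t$$

$$= \int_Y (V[\nu] + \varphi^c) \, dh_t + R_t \ge M \int_Y dh_t + R_t = R_t \to 0 \quad \text{as } t \to 0^+,$$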






which proves that $\nu$ is a minimiser.
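
For completeness, the remainder term can be written out explicitly. The following is a sketch that assumes the interaction kernel $\phi$ is symmetric, which is not restated in this proof. With $\nu_t = \nu + t h_t$,

$$\frac{1}{2t} \iint_{Y^2} \phi \, d[\nu_t \otimes \nu_t - \nu \otimes \nu] = \iint_{Y^2} \phi(y,z) \, dh_t(y) \, d\nu(z) + \frac{t}{2} \iint_{Y^2} \phi(y,z) \, dh_t(y) \, dh_t(z),$$

so that $R_t = \frac{t}{2} \iint_{Y^2} \phi \, dh_t \, dh_t = \frac{1}{2t} \int_Y \psi_t \, d(\nu_t - \nu)$. The Kantorovich-Rubinstein inequality together with (B.5) and $\mathrm{Lip}(\psi_t, Y) \le C t$ then gives

$$|R_t| \le \frac{1}{2t} \, \mathrm{Lip}(\psi_t, Y) \, W_1(\nu_t, \nu) \le \frac{1}{2t} \cdot C t \cdot t \, \mathrm{diam}(Y) = \frac{C \, \mathrm{diam}(Y)}{2} \, t,$$

which is the estimate $R_t = O(t)$ used in the proof.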